Praise for the First Edition
“2005 Best Java Book!”
—Java Developer’s Journal
“Hibernate In Action has to be considered the definitive tome on Hibernate. As the authors are intimately involved with the project, the insight on Hibernate that they provide can’t be easily duplicated.”
—JavaRanch.com
“Not only gets you up to speed with Hibernate and its features…It also introduces you to the right way of developing and tuning an industrial-quality Hibernate application. …albeit very technical, it reads astonishingly easy…unfortunately very rare nowadays…[an] excellent piece of work…”
—JavaLobby.com
“The first and only full tutorial, reference, and authoritative guide, and one of the most anticipated books of the year for Hibernate users.”
—Dr. Dobb’s Journal
“…the book was beyond my expectations…this book is the ultimate solution.”
—Javalobby.org, (second review, fall 2005)
“…from none others than the lead developer and the lead documenter, this book is a great introduction and reference documentation to using Hibernate. It is organized in such a way that the concepts are explained in progressive order from very simple to more complex, and the authors take good care of explaining every detail with good examples. …The book not only gets you up to speed with Hibernate and its features (which the documentation does quite well). It also introduces you to the right way of developing and tuning an industrial-quality Hibernate application.”
—Slashdot.org
“Strongly recommended, because a contemporary and state-of-the-art topic is very well explained, and especially, because the voices come literally from the horses’ mouths.”
—C Vu, the Journal of the ACCU
“The ultimate guide to the Hibernate open source project. It provides in-depth information on architecture of Hibernate, configuring Hibernate and development using Hibernate…It also explains essential concepts like, object/relational mapping (ORM), persistence, caching, queries and describes how they are taken care with respect to Hibernate…written by the creators of Hibernate and they have made best effort to introduce and leverage Hibernate. I recommend this book to everyone who is interested in getting familiar with Hibernate.”
—JavaReference.com
“Well worth the cost…While the on-line documentation is good, (Mr. Bauer, one of the authors is in charge of the on-line documentation) the book is better. It begins with a description of what you are trying to do (often left out in computer books) and leads you on in a consistent manner through the entire Hibernate system. Excellent Book!”
—Books-on-Line
“A compact (408 pages), focused, no nonsense read and an essential resource for anyone venturing into the ORM landscape. The first three chapters of this book alone are indispensable for developers that want to quickly build an application leveraging Hibernate, but more importantly really want to understand Hibernate concepts, framework, methodology and the reasons that shaped the framework design. The remaining chapters continue the comprehensive overview of Hibernate that include how to map to and persist objects, inheritance, transactions, concurrency, caching, retrieving objects efficiently using HQL, configuring Hibernate for managed and unmanaged environments, and the Hibernate Toolset that can be leveraged for several different development scenarios.”
—Columbia Java Users Group
“The authors show their knowledge of relational databases and the paradigm of mapping this world with the object-oriented world of Java. This is why the book is so good at explaining Hibernate in the context of solving or providing a solution to the very complex problem of object/relational mapping.”
—Denver JUG
Java Persistence with Hibernate
REVISED EDITION OF HIBERNATE IN ACTION
CHRISTIAN BAUER AND GAVIN KING
MANNING
Greenwich (74° w. long.)
For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact: Special Sales Department Manning Publications Co. Cherokee Station PO Box 20386 New York, NY 10021
Fax: (609) 877-8256 email: orders@manning.com
©2007 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end.
Manning Publications Co. 209 Bruce Park Avenue Greenwich, CT 06830
Copyeditor: Tiffany Taylor Typesetters: Dottie Marsico Cover designer: Leslie Haimes
ISBN 1-932394-88-5 Printed in the United States of America 1 2 3 4 5 6 7 8 9 10 – VHG – 10 09 08 07 06
brief contents
PART 1 GETTING STARTED WITH HIBERNATE AND EJB 3.0
1 ■ Understanding object/relational persistence
2 ■ Starting a project
3 ■ Domain models and metadata
PART 2 MAPPING CONCEPTS AND STRATEGIES
4 ■ Mapping persistent classes
5 ■ Inheritance and custom types
6 ■ Mapping collections and entity associations
7 ■ Advanced entity association mappings
8 ■ Legacy databases and custom SQL
PART 3 CONVERSATIONAL OBJECT PROCESSING
9 ■ Working with objects
10 ■ Transactions and concurrency
11 ■ Implementing conversations
12 ■ Modifying objects efficiently
13 ■ Optimizing fetching and caching
14 ■ Querying with HQL and JPA QL
15 ■ Advanced query options
16 ■ Creating and testing layered applications
17 ■ Introducing JBoss Seam
appendix A ■ SQL fundamentals
appendix B ■ Mapping quick reference
contents
foreword to the revised edition
foreword to the first edition
preface to the revised edition
preface to the first edition
acknowledgments
about this book
about the cover illustration
PART 1 GETTING STARTED WITH HIBERNATE AND EJB 3.0
1 Understanding object/relational persistence
1.1 What is persistence?
    Relational databases ■ Understanding SQL ■ Using SQL in Java ■ Persistence in object-oriented applications
1.2 The paradigm mismatch
    The problem of granularity ■ The problem of subtypes ■ The problem of identity ■ Problems relating to associations ■ The problem of data navigation ■ The cost of the mismatch
1.3 Persistence layers and alternatives
    Layered architecture ■ Hand-coding a persistence layer with SQL/JDBC ■ Using serialization ■ Object-oriented database systems ■ Other options
1.4 Object/relational mapping
    What is ORM? ■ Generic ORM problems ■ Why ORM? ■ Introducing Hibernate, EJB3, and JPA
1.5 Summary
2 Starting a project
2.1 Starting a Hibernate project
    Selecting a development process ■ Setting up the project ■ Hibernate configuration and startup ■ Running and testing the application
2.2 Starting a Java Persistence project
    Using Hibernate Annotations ■ Using Hibernate EntityManager ■ Introducing EJB components ■ Switching to Hibernate interfaces
2.3 Reverse engineering a legacy database
    Creating a database configuration ■ Customizing reverse engineering ■ Generating Java source code
2.4 Integration with Java EE services
    Integration with JTA ■ JNDI-bound SessionFactory ■ JMX service deployment
2.5 Summary
3 Domain models and metadata
3.1 The CaveatEmptor application
    Analyzing the business domain ■ The CaveatEmptor domain model
3.2 Implementing the domain model
    Addressing leakage of concerns ■ Transparent and automated persistence ■ Writing POJOs and persistent entity classes ■ Implementing POJO associations ■ Adding logic to accessor methods
3.3 Object/relational mapping metadata
    Metadata in XML ■ Annotation-based metadata ■ Using XDoclet ■ Handling global metadata ■ Manipulating metadata at runtime
3.4 Alternative entity representation
    Creating dynamic applications ■ Representing data in XML
3.5 Summary
PART 2 MAPPING CONCEPTS AND STRATEGIES
4 Mapping persistent classes
4.1 Understanding entities and value types
    Fine-grained domain models ■ Defining the concept ■ Identifying entities and value types
4.2 Mapping entities with identity
    Understanding Java identity and equality ■ Handling database identity ■ Database primary keys
4.3 Class mapping options
    Dynamic SQL generation ■ Making an entity immutable ■ Naming entities for querying ■ Declaring a package name ■ Quoting SQL identifiers ■ Implementing naming conventions
4.4 Fine-grained models and mappings
    Mapping basic properties ■ Mapping components
4.5 Summary
5 Inheritance and custom types
5.1 Mapping class inheritance
    Table per concrete class with implicit polymorphism ■ Table per concrete class with unions ■ Table per class hierarchy ■ Table per subclass ■ Mixing inheritance strategies ■ Choosing a strategy
5.2 The Hibernate type system
    Recapitulating entity and value types ■ Built-in mapping types ■ Using mapping types
5.3 Creating custom mapping types
    Considering custom mapping types ■ The extension points ■ The case for custom mapping types ■ Creating a UserType ■ Creating a CompositeUserType ■ Parameterizing custom types ■ Mapping enumerations
5.4 Summary
6 Mapping collections and entity associations
6.1 Sets, bags, lists, and maps of value types
    Selecting a collection interface ■ Mapping a set ■ Mapping an identifier bag ■ Mapping a list ■ Mapping a map ■ Sorted and ordered collections
6.2 Collections of components
    Writing the component class ■ Mapping the collection ■ Enabling bidirectional navigation ■ Avoiding not-null columns
6.3 Mapping collections with annotations
    Basic collection mapping ■ Sorted and ordered collections ■ Mapping a collection of embedded objects
6.4 Mapping a parent/children relationship
    Multiplicity ■ The simplest possible association ■ Making the association bidirectional ■ Cascading object state
6.5 Summary
7 Advanced entity association mappings
7.1 Single-valued entity associations
    Shared primary key associations ■ One-to-one foreign key associations ■ Mapping with a join table
7.2 Many-valued entity associations
    One-to-many associations ■ Many-to-many associations ■ Adding columns to join tables ■ Mapping maps
7.3 Polymorphic associations
    Polymorphic many-to-one associations ■ Polymorphic collections ■ Polymorphic associations to unions ■ Polymorphic table per concrete class
7.4 Summary
8 Legacy databases and custom SQL
8.1 Integrating legacy databases
    Handling primary keys ■ Arbitrary join conditions with formulas ■ Joining arbitrary tables ■ Working with triggers
8.2 Customizing SQL
    Writing custom CRUD statements ■ Integrating stored procedures and functions
8.3 Improving schema DDL
    Custom SQL names and datatypes ■ Ensuring data consistency ■ Adding domains and column constraints ■ Table-level constraints ■ Database constraints ■ Creating indexes ■ Adding auxiliary DDL
8.4 Summary
PART 3 CONVERSATIONAL OBJECT PROCESSING
9 Working with objects
9.1 The persistence lifecycle
    Object states ■ The persistence context
9.2 Object identity and equality
    Introducing conversations ■ The scope of object identity ■ The identity of detached objects ■ Extending a persistence context
9.3 The Hibernate interfaces
    Storing and loading objects ■ Working with detached objects ■ Managing the persistence context
9.4 The Java Persistence API
    Storing and loading objects ■ Working with detached entity instances
9.5 Using Java Persistence in EJB components
    Injecting an EntityManager ■ Looking up an EntityManager ■ Accessing an EntityManagerFactory
9.6 Summary
10 Transactions and concurrency
10.1 Transaction essentials
    Database and system transactions ■ Transactions in a Hibernate application ■ Transactions with Java Persistence
10.2 Controlling concurrent access
    Understanding database-level concurrency ■ Optimistic concurrency control ■ Obtaining additional isolation guarantees
10.3 Nontransactional data access
    Debunking autocommit myths ■ Working nontransactionally with Hibernate ■ Optional transactions with JTA
10.4 Summary
11 Implementing conversations
11.1 Propagating the Hibernate Session
    The use case for Session propagation ■ Propagation through thread-local ■ Propagation with JTA ■ Propagation with EJBs
11.2 Conversations with Hibernate
    Providing conversational guarantees ■ Conversations with detached objects ■ Extending a Session for a conversation
11.3 Conversations with JPA
    Persistence context propagation in Java SE ■ Merging detached objects in conversations ■ Extending the persistence context in Java SE
11.4 Conversations with EJB 3.0
    Context propagation with EJBs ■ Extended persistence contexts with EJBs
11.5 Summary
12 Modifying objects efficiently
12.1 Transitive persistence
    Persistence by reachability ■ Applying cascading to associations ■ Working with transitive state ■ Transitive associations with JPA
12.2 Bulk and batch operations
    Bulk statements with HQL and JPA QL ■ Processing with batches ■ Using a stateless Session
12.3 Data filtering and interception
    Dynamic data filters ■ Intercepting Hibernate events ■ The core event system ■ Entity listeners and callbacks
12.4 Summary
13 Optimizing fetching and caching
13.1 Defining the global fetch plan
    The object-retrieval options ■ The lazy default fetch plan ■ Understanding proxies ■ Disabling proxy generation ■ Eager loading of associations and collections ■ Lazy loading with interception
13.2 Selecting a fetch strategy
    Prefetching data in batches ■ Prefetching collections with subselects ■ Eager fetching with joins ■ Optimizing fetching for secondary tables ■ Optimization guidelines
13.3 Caching fundamentals
    Caching strategies and scopes ■ The Hibernate cache architecture
13.4 Caching in practice
    Selecting a concurrency control strategy ■ Understanding cache regions ■ Setting up a local cache provider ■ Setting up a replicated cache ■ Controlling the second-level cache
13.5 Summary
14 Querying with HQL and JPA QL
14.1 Creating and running queries
    Preparing a query ■ Executing a query ■ Using named queries
14.2 Basic HQL and JPA QL queries
    Selection ■ Restriction ■ Projection
14.3 Joins, reporting queries, and subselects
    Joining relations and associations ■ Reporting queries ■ Using subselects
14.4 Summary
15 Advanced query options
15.1 Querying with criteria and example
    Basic criteria queries ■ Joins and dynamic fetching ■ Projection and report queries ■ Query by example
15.2 Using native SQL queries
    Automatic resultset handling ■ Retrieving scalar values ■ Native SQL in Java Persistence
15.3 Filtering collections
15.4 Caching query results
    Enabling the query result cache ■ Understanding the query cache ■ When to use the query cache ■ Natural identifier cache lookups
15.5 Summary
16 Creating and testing layered applications
16.1 Hibernate in a web application
    Introducing the use case ■ Writing a controller ■ The Open Session in View pattern ■ Designing smart domain models
16.2 Creating a persistence layer
    A generic data-access object pattern ■ Implementing the generic CRUD interface ■ Implementing entity DAOs ■ Using data-access objects
16.3 Introducing the Command pattern
    The basic interfaces ■ Executing command objects ■ Variations of the Command pattern
16.4 Designing applications with EJB 3.0
    Implementing a conversation with stateful beans ■ Writing DAOs with EJBs ■ Utilizing dependency injection
16.5 Testing
    Understanding different kinds of tests ■ Introducing TestNG ■ Testing the persistence layer ■ Considering performance benchmarks
16.6 Summary
17 Introducing JBoss Seam
17.1 The Java EE 5.0 programming model
    Considering JavaServer Faces ■ Considering EJB 3.0 ■ Writing a web application with JSF and EJB 3.0 ■ Analyzing the application
17.2 Improving the application with Seam
    Configuring Seam ■ Binding pages to stateful Seam components ■ Analyzing the Seam application
17.3 Understanding contextual components
    Writing the login page ■ Creating the components ■ Aliasing contextual variables ■ Completing the login/logout feature
17.4 Validating user input
    Introducing Hibernate Validator ■ Creating the registration page ■ Internationalization with Seam
17.5 Simplifying persistence with Seam
    Implementing a conversation ■ Letting Seam manage the persistence context
17.6 Summary
appendix A SQL fundamentals
appendix B Mapping quick reference
references
index
foreword to the revised edition
When Hibernate in Action was published two years ago, it was immediately recognized not only as the definitive book on Hibernate, but also as the definitive work on object/relational mapping. In the intervening time, the persistence landscape has changed with the release of the Java Persistence API, the new standard for object/relational mapping for Java EE and Java SE, which was developed under the Java Community Process as part of the Enterprise JavaBeans 3.0 Specification.
In developing the Java Persistence API, the EJB 3.0 Expert Group benefitted heavily from the experience of the O/R mapping frameworks already in use in the Java community. As one of the leaders among these, Hibernate has had a very significant influence on the technical direction of Java Persistence. This was due not only to the participation of Gavin King and other members of the Hibernate team in the EJB 3.0 standardization effort, but also in large part to the direct and pragmatic approach that Hibernate has taken towards O/R mapping and to the simplicity, clarity, and power of its APIs, and their resulting appeal to the Java community.
In addition to their contributions to Java Persistence, the Hibernate developers also have taken major steps forward for Hibernate with the Hibernate 3 release described in this book. Among these are support for operations over large datasets; additional and more sophisticated mapping options, especially for handling legacy databases; data filters; strategies for managing conversations; and integration with Seam, the new framework for web application development with JSF and EJB 3.0.
Java Persistence with Hibernate is therefore considerably more than simply a second edition to Hibernate in Action. It provides a comprehensive overview of all the capabilities of the Java Persistence API in addition to those of Hibernate 3, as well as a detailed comparative analysis of the two. It describes how Hibernate has been used to implement the Java Persistence standard, and how to leverage the Hibernate extensions to Java Persistence. More important, throughout the presentation of Hibernate and Java Persistence, Christian Bauer and Gavin King illustrate and explain the fundamental principles and decisions that need to be taken into account in both the design and use of an object/relational mapping framework. The insights they provide into the underlying issues of ORM give the reader a deep understanding of the effective application of ORM as an enterprise technology.
Java Persistence with Hibernate thus reaches out to a wide range of developers, from newcomers to object/relational mapping to experienced developers, seeking to learn more about cutting-edge technological innovations in the Java community that have occurred and are continuing to emerge as a result of this work.
LINDA DEMICHIEL
Specification Lead
Enterprise JavaBeans 3.0 and Java Persistence
Sun Microsystems
foreword to the first edition
Relational databases are indisputably at the core of the modern enterprise. While modern programming languages, including Java™, provide an intuitive, object-oriented view of application-level business entities, the enterprise data underlying these entities is heavily relational in nature. Further, the main strength of the relational model (over earlier navigational models as well as over later OODB models) is that by design it is intrinsically agnostic to the programmatic manipulation and application-level view of the data that it serves up.
Many attempts have been made to bridge relational and object-oriented technologies, or to replace one with the other, but the gap between the two is one of the hard facts of enterprise computing today. It is this challenge, to provide a bridge between relational data and Java™ objects, that Hibernate takes on through its object/relational mapping (ORM) approach. Hibernate meets this challenge in a very pragmatic, direct, and realistic way.
As Christian Bauer and Gavin King demonstrate in this book, the effective use of ORM technology in all but the simplest of enterprise environments requires understanding and configuring how the mediation between relational data and objects is performed. This demands that the developer be aware and knowledgeable both of the application and its data requirements, and of the SQL query language, relational storage structures, and the potential for optimization that relational technology offers.
Not only does Hibernate provide a full-function solution that meets these requirements head on, it is also a flexible and configurable architecture. Hibernate’s developers designed it with modularity, pluggability, extensibility, and user customization in mind. As a result, in the few years since its initial release, Hibernate has rapidly become one of the leading ORM technologies for enterprise developers, and deservedly so.
This book provides a comprehensive overview of Hibernate. It covers how to use its type mapping capabilities and facilities for modeling associations and inheritance; how to retrieve objects efficiently using the Hibernate query language; how to configure Hibernate for use in both managed and unmanaged environments; and how to use its tools. In addition, throughout the book the authors provide insight into the underlying issues of ORM and into the design choices behind Hibernate. These insights give the reader a deep understanding of the effective use of ORM as an enterprise technology.
Hibernate in Action is the definitive guide to using Hibernate and to object/relational mapping in enterprise computing today.
LINDA DEMICHIEL
Lead Architect, Enterprise JavaBeans
Sun Microsystems
preface to the revised edition
The predecessor of this book, Hibernate in Action, started with a quote from Anthony Berglas: “Just because it is possible to push twigs along the ground with one’s nose does not necessarily mean that that is the best way to collect firewood.” Since then, the Hibernate project and the strategies and concepts software developers rely on to manage information have evolved. However, the fundamental issues are still the same: every company we work with every day still uses SQL databases, and Java is entrenched in the industry as the first choice for enterprise application development.
The tabular representation of data in a relational system is still fundamentally different than the networks of objects used in object-oriented Java applications. We still see the object/relational impedance mismatch, and we frequently see that the importance and cost of this mismatch is underestimated.
On the other hand, we now have a range of tools and solutions available to deal with this problem. We’re done collecting firewood, and the pocket lighter has been replaced with a flame thrower.
Hibernate is now available in its third major release; Hibernate 3.2 is the version we describe in this book. Compared to older Hibernate versions, this new major release has twice as many features, and this book is almost double the size of Hibernate in Action. Most of these features are ones that you, the developers working with Hibernate every day, have asked for. We’ve sometimes said that Hibernate is a 90 percent solution for all the problems a Java application developer has to deal with when creating a database application. With the latest Hibernate version, this number is more likely 99 percent.
As Hibernate matured and its user base and community kept growing, the Java standards for data management and database application development were found lacking by many developers. We even told you not to use EJB 2.x entity beans in Hibernate in Action.
Enter EJB 3.0 and the new Java Persistence standard. This new industry standard is a major step forward for the Java developer community. It defines a lightweight and simplified programming model and powerful object/relational persistence. Many of the key concepts of the new standard were modeled after Hibernate and other successful object/relational persistence solutions.
The latest Hibernate version implements the Java Persistence standard. So, in addition to the new all-in-one Hibernate for every purpose, you can now use Hibernate like any Java Persistence provider, with or without other EJB 3.0 components and Java EE 5.0 services. This deep integration of Hibernate with such a rich programming model enables you to design and implement application functionality that was difficult to create by hand before.
We wrote this book to give you a complete and accurate guide to both Hibernate and Java Persistence (and also all relevant EJB 3.0 concepts). We hope that you’ll enjoy learning Hibernate and that you’ll keep this reference bible on your desk for your daily work.
preface to the first edition
Just because it is possible to push twigs along the ground with one’s nose does not necessarily mean that that is the best way to collect firewood.
—Anthony Berglas
Today, many software developers work with Enterprise Information Systems (EIS). This kind of application creates, manages, and stores structured information and shares this information between many users in multiple physical locations.
The storage of EIS data involves massive usage of SQL-based database management systems. Every company we’ve met during our careers uses at least one SQL database; most are completely dependent on relational database technology at the core of their business.
In the past five years, broad adoption of the Java programming language has brought about the ascendancy of the object-oriented paradigm for software development. Developers are now sold on the benefits of object orientation. However, the vast majority of businesses are also tied to long-term investments in expensive relational database systems. Not only are particular vendor products entrenched, but existing legacy data must be made available to (and via) the shiny new object-oriented web applications.
However, the tabular representation of data in a relational system is fundamentally different than the networks of objects used in object-oriented Java applications. This difference has led to the so-called object/relational paradigm mismatch.
Traditionally, the importance and cost of this mismatch have been underestimated, and tools for solving the mismatch have been insufficient. Meanwhile, Java developers blame relational technology for the mismatch; data professionals blame object technology.
Object/relational mapping (ORM) is the name given to automated solutions to the mismatch problem. For developers weary of tedious data access code, the good news is that ORM has come of age. Applications built with ORM middleware can be expected to be cheaper, more performant, less vendor-specific, and more able to cope with changes to the internal object or underlying SQL schema. The astonishing thing is that these benefits are now available to Java developers for free.
Gavin King began developing Hibernate in late 2001 when he found that the popular persistence solution at the time, CMP Entity Beans, didn’t scale to nontrivial applications with complex data models. Hibernate began life as an independent, noncommercial open source project.
The Hibernate team (including the authors) has learned ORM the hard way; that is, by listening to user requests and implementing what was needed to satisfy those requests. The result, Hibernate, is a practical solution, emphasizing developer productivity and technical leadership. Hibernate has been used by tens of thousands of users and in many thousands of production applications.
When the demands on their time became overwhelming, the Hibernate team concluded that the future success of the project (and Gavin’s continued sanity) demanded professional developers dedicated full-time to Hibernate. Hibernate joined jboss.org in late 2003 and now has a commercial aspect; you can purchase commercial support and training from JBoss Inc. But commercial training shouldn’t be the only way to learn about Hibernate.
It’s obvious that many, perhaps even most, Java projects benefit from the use of an ORM solution like Hibernate, although this wasn’t obvious a couple of years ago!
As ORM technology becomes increasingly mainstream, product documentation such as Hibernate’s free user manual is no longer sufficient. We realized that the Hibernate community and new Hibernate users needed a full-length book, not only to learn about developing software with Hibernate, but also to understand and appreciate the object/relational mismatch and the motivations behind Hibernate’s design.
The book you’re holding was an enormous effort that occupied most of our spare time for more than a year. It was also the source of many heated disputes and learning experiences. We hope this book is an excellent guide to Hibernate (or, “the Hibernate bible,” as one of our reviewers put it) and also the first comprehensive documentation of the object/relational mismatch and ORM in general. We hope you find it helpful and enjoy working with Hibernate.
acknowledgments
This book grew from a small second edition of Hibernate in Action into a volume of considerable size. We couldn’t have created it without the help of many people.
Emmanuel Bernard did an excellent job as the technical reviewer of this book; thank you for the many hours you spent editing our broken code examples. We’d also like to thank our other reviewers: Patrick Dennis, Jon Skeet, Awais Bajwa, Dan Dobrin, Deiveehan Nallazhagappan, Ryan Daigle, Stuart Caborn, Patrick Peak, TVS Murthy, Bill Fly, David Walend, Dave Dribin, Anjan Bacchu, Gary Udstrand, and Srinivas Nallapati. Special thanks to Linda DeMichiel for agreeing to write the foreword to our book, as she did to the first edition.
Marjan Bace again assembled a great production team at Manning: Sydney Jones edited our crude manuscript and turned it into a real book. Tiffany Taylor, Elizabeth Martin, and Andy Carroll found all our typos and made the book readable. Dottie Marsico was responsible for typesetting and gave this book its great look. Mary Piergies coordinated and organized the production process. We’d like to thank you all for working with us.
about this book
We had three goals when writing this book, so you can read it as
■ A tutorial for Hibernate, Java Persistence, and EJB 3.0 that guides you through your first steps with these solutions
■ A guide for learning all basic and advanced Hibernate features for object/relational mapping, object processing, querying, performance optimization, and application design
■ A reference for whenever you need a complete and technically accurate definition of Hibernate and Java Persistence functionality
Usually, books are either tutorials or reference guides, so this stretch comes at a price. If you’re new to Hibernate, we suggest that you start reading the book from the start, with the tutorials in chapters 1 and 2. If you have used an older version of Hibernate, you should read the first two chapters quickly to get an overview and then jump into the middle with chapter 3. We will, whenever appropriate, tell you if a particular section or subject is optional or reference material that you can safely skip during your first read.
Roadmap
This book is divided into three major parts. In part 1, we introduce the object/relational paradigm mismatch and explain the fundamentals behind object/relational mapping. We walk through a hands-on tutorial to get you started with your first Hibernate, Java Persistence, or EJB 3.0 project. We look at Java application design for domain models and at the options for creating object/relational mapping metadata.
Mapping Java classes and properties to SQL tables and columns is the focus of part 2. We explore all basic and advanced mapping options in Hibernate and Java Persistence, with XML mapping files and Java annotations. We show you how to deal with inheritance, collections, and complex class associations. Finally, we discuss integration with legacy database schemas and some mapping strategies that are especially tricky.
Part 3 is all about the processing of objects and how you can load and store data with Hibernate and Java Persistence. We introduce the programming interfaces, how to write transactional and conversation-aware applications, and how to write queries. Later, we focus on the correct design and implementation of layered Java applications. We discuss the most common design patterns that are used with Hibernate, such as the Data Access Object (DAO) and EJB Command patterns. You’ll see how you can test your Hibernate application easily and what other best practices are relevant if you work with object/relational mapping software. Finally, we introduce the JBoss Seam framework, which takes many Hibernate concepts to the next level and enables you to create conversational web applications with ease. We promise you’ll find this chapter interesting, even if you don’t plan to use Seam.
Who should read this book?
Readers of this book should have basic knowledge of object-oriented software development and should have used this knowledge in practice. To understand the application examples, you should be familiar with the Java programming language and the Unified Modeling Language. Our primary target audience consists of Java developers who work with SQL-based database systems. We’ll show you how to substantially increase your productivity by leveraging ORM. If you’re a database developer, the book can be part of your introduction to object-oriented software development. If you’re a database administrator, you’ll be interested in how ORM affects performance and how you can tune the performance of the SQL database-management system and persistence layer to achieve performance targets. Because data
access is the bottleneck in most Java applications, this book pays close attention to performance issues. Many DBAs are understandably nervous about entrusting performance to tool-generated SQL code; we seek to allay those fears and also to highlight cases where applications shouldn’t use tool-managed data access. You may be relieved to discover that we don’t claim that ORM is the best solution to every problem.
Code conventions
This book provides copious examples, which include all the Hibernate application artifacts: Java code, Hibernate configuration files, and XML mapping metadata files. Source code in listings or in text is in a fixed-width font like this to separate it from ordinary text. Additionally, Java method names, component parameters, object properties, and XML elements and attributes in text are also presented using fixed-width font. Java, HTML, and XML can all be verbose. In many cases, the original source code (available online) has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers. Additionally, comments in the source code have often been removed from the listings when the code is described in the text. Code annotations accompany some of the source code listings, highlighting important concepts. In some cases, numbered bullets link to explanations that follow the listing.
Source code downloads
Hibernate is an open source project released under the GNU Lesser General Public License (LGPL). Directions for downloading Hibernate packages, in source or binary form, are available from the Hibernate web site: www.hibernate.org/. The source code for all Hello World and CaveatEmptor examples in this book is available from http://caveatemptor.hibernate.org/ under a free (BSD-like) license. The CaveatEmptor example application code is available on this web site in different flavors: for example, with a focus on native Hibernate, on Java Persistence, and on JBoss Seam. You can also download the code for the examples in this book from the publisher’s website, www.manning.com/bauer2.
About the authors
Christian Bauer is a member of the Hibernate developer team. He works as a trainer, consultant, and product manager for Hibernate, EJB 3.0, and JBoss Seam at JBoss, a division of Red Hat. With Gavin King, Christian wrote Hibernate in Action. Gavin King is the founder of the Hibernate and JBoss Seam projects, and a member of the EJB 3.0 (JSR 220) expert group. He also leads the Web Beans JSR 299, a standardization effort involving Hibernate concepts, JBoss Seam, JSF, and EJB 3.0. Gavin works as a lead developer at JBoss, a division of Red Hat.
Author Online
Your purchase of Java Persistence with Hibernate includes free access to a private web forum run by Manning Publications, where you can make comments about the book, ask technical questions, and receive help from the authors and from other users. To access the forum and subscribe to it, point your web browser to www.manning.com/bauer2. This page provides information on how to get onto the forum once you are registered, what kind of help is available, and the rules of conduct on the forum. Manning’s commitment to our readers is to provide a venue where a meaningful dialogue among individual readers and between readers and the authors can take place. It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the AO remains voluntary (and unpaid). We suggest you try asking the authors some challenging questions, lest their interest stray! The Author Online forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.
about the cover illustration
The illustration on the cover of Java Persistence with Hibernate is taken from a collection of costumes of the Ottoman Empire published on January 1, 1802, by William Miller of Old Bond Street, London. The title page is missing from the collection and we have been unable to track it down to date. The book’s table of contents identifies the figures in both English and French, and each illustration bears the names of two artists who worked on it, both of whom would no doubt be surprised to find their art gracing the front cover of a computer programming book…two hundred years later. The collection was purchased by a Manning editor at an antiquarian flea market in the “Garage” on West 26th Street in Manhattan. The seller was an American based in Ankara, Turkey, and the transaction took place just as he was packing up his stand for the day. The Manning editor did not have on his person the substantial amount of cash that was required for the purchase and a credit card and check were both politely turned down. With the seller flying back to Ankara that evening the situation was getting hopeless. What was the solution? It turned out to be nothing more than an old-fashioned verbal agreement sealed with a handshake. The seller simply proposed that the money be transferred to him by wire and the editor walked out with the bank information on a piece of paper and the portfolio of images under his arm. Needless to say, we transferred the funds the next day, and we remain grateful and impressed by this unknown person’s trust in one of us. It recalls something that might have happened a long time ago.
The pictures from the Ottoman collection, like the other illustrations that appear on our covers, bring to life the richness and variety of dress customs of two centuries ago. They recall the sense of isolation and distance of that period, and of every other historic period except our own hyperkinetic present. Dress codes have changed since then and the diversity by region, so rich at the time, has faded away. It is now often hard to tell the inhabitant of one continent from another. Perhaps, trying to view it optimistically, we have traded a cultural and visual diversity for a more varied personal life. Or a more varied and interesting intellectual and technical life. We at Manning celebrate the inventiveness, the initiative, and, yes, the fun of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by the pictures from this collection.
Part 1 Getting started with Hibernate and EJB 3.0
In part 1, we show you why object persistence is such a complex topic and what solutions you can apply in practice. Chapter 1 introduces the object/relational paradigm mismatch and several strategies to deal with it, foremost object/relational mapping (ORM). In chapter 2, we guide you step by step through a tutorial with Hibernate, Java Persistence, and EJB 3.0; you’ll implement and test a “Hello World” example in all variations. Thus prepared, in chapter 3 you’re ready to learn how to design and implement complex business domain models in Java, and which mapping metadata options you have available. After reading this part of the book, you’ll understand why you need object/relational mapping, and how Hibernate, Java Persistence, and EJB 3.0 work in practice. You’ll have written your first small project, and you’ll be ready to take on more complex problems. You’ll also understand how real-world business entities can be implemented as a Java domain model, and in what format you prefer to work with object/relational mapping metadata.
Understanding object/relational persistence
This chapter covers
■ Object persistence with SQL databases
■ The object/relational paradigm mismatch
■ Persistence layers in object-oriented applications
■ Object/relational mapping background
The approach to managing persistent data has been a key design decision in every software project we’ve worked on. Given that persistent data isn’t a new or unusual requirement for Java applications, you’d expect to be able to make a simple choice among similar, well-established persistence solutions. Think of web application frameworks (Struts versus WebWork), GUI component frameworks (Swing versus SWT), or template engines (JSP versus Velocity). Each of the competing solutions has various advantages and disadvantages, but they all share the same scope and overall approach. Unfortunately, this isn’t yet the case with persistence technologies, where we see some wildly differing solutions to the same problem. For several years, persistence has been a hot topic of debate in the Java community. Many developers don’t even agree on the scope of the problem. Is persistence a problem that is already solved by relational technology and extensions such as stored procedures, or is it a more pervasive problem that must be addressed by special Java component models, such as EJB entity beans? Should we hand-code even the most primitive CRUD (create, read, update, delete) operations in SQL and JDBC, or should this work be automated? How do we achieve portability if every database management system has its own SQL dialect? Should we abandon SQL completely and adopt a different database technology, such as object database systems? Debate continues, but a solution called object/relational mapping (ORM) now has wide acceptance. Hibernate is an open source ORM service implementation. Hibernate is an ambitious project that aims to be a complete solution to the problem of managing persistent data in Java. It mediates the application’s interaction with a relational database, leaving the developer free to concentrate on the business problem at hand. Hibernate is a nonintrusive solution. 
You aren’t required to follow many Hibernate-specific rules and design patterns when writing your business logic and persistent classes; thus, Hibernate integrates smoothly with most new and existing applications and doesn’t require disruptive changes to the rest of the application. This book is about Hibernate. We’ll cover basic and advanced features and describe some ways to develop new applications using Hibernate. Often, these recommendations won’t even be specific to Hibernate. Sometimes they will be our ideas about the best ways to do things when working with persistent data, explained in the context of Hibernate. This book is also about Java Persistence, a new standard for persistence that is part of the also updated EJB 3.0 specification. Hibernate implements Java Persistence and supports all the standardized mappings, queries, and APIs. Before we can get started with Hibernate, however, you need to understand the core problems of object persistence and object/relational
mapping. This chapter explains why tools like Hibernate and specifications such as Java Persistence and EJB 3.0 are needed. First, we define persistent data management in the context of object-oriented applications and discuss the relationship of SQL, JDBC, and Java, the underlying technologies and standards that Hibernate is built on. We then discuss the so-called object/relational paradigm mismatch and the generic problems we encounter in object-oriented software development with relational databases. These problems make it clear that we need tools and patterns to minimize the time we have to spend on the persistence-related code of our applications. After we look at alternative tools and persistence mechanisms, you’ll see that ORM is the best available solution for many scenarios. Our discussion of the advantages and drawbacks of ORM will give you the full background to make the best decision when picking a persistence solution for your own project. We also take a look at the various Hibernate software modules, and how you can combine them to either work with Hibernate only, or with Java Persistence and EJB 3.0-compliant features. The best way to learn Hibernate isn’t necessarily linear. We understand that you may want to try Hibernate right away. If this is how you’d like to proceed, skip to the second chapter of this book and have a look at the “Hello World” example and set up a project. We recommend that you return here at some point as you circle through the book. That way, you’ll be prepared and have all the background concepts you need for the rest of the material.
1.1 What is persistence?
Almost all applications require persistent data. Persistence is one of the fundamental concepts in application development. If an information system didn’t preserve data when it was powered off, the system would be of little practical use. When we talk about persistence in Java, we’re normally talking about storing data in a relational database using SQL. We’ll start by taking a brief look at the technology and how we use it with Java. Armed with that information, we’ll then continue our discussion of persistence and how it’s implemented in object-oriented applications.
1.1.1 Relational databases
You, like most other developers, have probably worked with a relational database. Most of us use a relational database every day. Relational technology is a known quantity, and this alone is sufficient reason for many organizations to choose it.
But to say only this is to pay less respect than is due. Relational databases are entrenched because they’re an incredibly flexible and robust approach to data management. Due to the complete and consistent theoretical foundation of the relational data model, relational databases can effectively guarantee and protect the integrity of the data, among other desirable characteristics. Some people would even say that the last big invention in computing has been the relational concept for data management as first introduced by E.F. Codd (Codd, 1970) more than three decades ago. Relational database management systems aren’t specific to Java, nor is a relational database specific to a particular application. This important principle is known as data independence. In other words, and we can’t stress this important fact enough, data lives longer than any application does. Relational technology provides a way of sharing data among different applications, or among different technologies that form parts of the same application (the transactional engine and the reporting engine, for example). Relational technology is a common denominator of many disparate systems and technology platforms. Hence, the relational data model is often the common enterprise-wide representation of business entities. Relational database management systems have SQL-based application programming interfaces; hence, we call today’s relational database products SQL database management systems or, when we’re talking about particular systems, SQL databases. Before we go into more detail about the practical aspects of SQL databases, we have to mention an important issue: Although marketed as relational, a database system providing only an SQL data language interface isn’t really relational and in many ways isn’t even close to the original concept. Naturally, this has led to confusion. 
SQL practitioners blame the relational data model for shortcomings in the SQL language, and relational data management experts blame the SQL standard for being a weak implementation of the relational model and ideals. Application developers are stuck somewhere in the middle, with the burden to deliver something that works. We’ll highlight some important and significant aspects of this issue throughout the book, but generally we’ll focus on the practical aspects. If you’re interested in more background material, we highly recommend Practical Issues in Database Management: A Reference for the Thinking Practitioner by Fabian Pascal (Pascal, 2000).
1.1.2 Understanding SQL
To use Hibernate effectively, a solid understanding of the relational model and SQL is a prerequisite. You need to understand the relational model and topics such as normalization to guarantee the integrity of your data, and you’ll need to
use your knowledge of SQL to tune the performance of your Hibernate application. Hibernate automates many repetitive coding tasks, but your knowledge of persistence technology must extend beyond Hibernate itself if you want to take advantage of the full power of modern SQL databases. Remember that the underlying goal is robust, efficient management of persistent data. Let’s review some of the SQL terms used in this book. You use SQL as a data definition language (DDL) to create a database schema with CREATE and ALTER statements. After creating tables (and indexes, sequences, and so on), you use SQL as a data manipulation language (DML) to manipulate and retrieve data. The manipulation operations include insertions, updates, and deletions. You retrieve data by executing queries with restrictions, projections, and join operations (including the Cartesian product). For efficient reporting, you use SQL to group, order, and aggregate data as necessary. You can even nest SQL statements inside each other; this technique uses subselects. You’ve probably used SQL for many years and are familiar with the basic operations and statements written in this language. Still, we know from our own experience that SQL is sometimes hard to remember, and some terms vary in usage. To understand this book, we must use the same terms and concepts, so we advise you to read appendix A if any of the terms we’ve mentioned are new or unclear. If you need more details, especially about any performance aspects and how SQL is executed, get a copy of the excellent book SQL Tuning by Dan Tow (Tow, 2003). Also read An Introduction to Database Systems by Chris Date (Date, 2003) for the theory, concepts, and ideals of (relational) database systems. The latter book is an excellent reference (it’s big) for all questions you may possibly have about databases and data management. 
Although the relational database is one part of ORM, the other part, of course, consists of the objects in your Java application that need to be persisted to and loaded from the database using SQL.
1.1.3 Using SQL in Java
When you work with an SQL database in a Java application, the Java code issues SQL statements to the database via the Java Database Connectivity (JDBC) API. Whether the SQL was written by hand and embedded in the Java code, or generated on the fly by Java code, you use the JDBC API to bind arguments to prepare query parameters, execute the query, scroll through the query result table, retrieve values from the result set, and so on. These are low-level data access tasks; as application developers, we’re more interested in the business problem that requires this data access. What we’d really like to write is code that saves and
retrieves objects (the instances of our classes) to and from the database, relieving us of this low-level drudgery. Because the data access tasks are often so tedious, we have to ask: Are the relational data model and (especially) SQL the right choices for persistence in object-oriented applications? We answer this question immediately: Yes! There are many reasons why SQL databases dominate the computing industry: relational database management systems are the only proven data management technology, and they’re almost always a requirement in any Java project. However, for the last 15 years, developers have spoken of a paradigm mismatch. This mismatch explains why so much effort is expended on persistence-related concerns in every enterprise project. The paradigms referred to are object modeling and relational modeling, or perhaps object-oriented programming and SQL. Let’s begin our exploration of the mismatch problem by asking what persistence means in the context of object-oriented application development. First we’ll widen the simplistic definition of persistence stated at the beginning of this section to a broader, more mature understanding of what is involved in maintaining and using persistent data.
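The low-level JDBC tasks listed in this section can be sketched as follows. This is a minimal illustration, not code from the book's example application: the USERS table with USERNAME and NAME columns, the UserNameFinder class, and the findNames() method are assumptions made for the example, and the caller is assumed to supply an open java.sql.Connection. The sketch prepares a statement, binds an argument to its single parameter, executes the query, scrolls through the result table, and retrieves one value per row.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the USERS table and its columns are assumed.
final class UserNameFinder {

    // Hand-written SQL with a single positional parameter.
    static final String FIND_NAMES_SQL =
        "select NAME from USERS where USERNAME like ? order by NAME";

    private UserNameFinder() {
    }

    // Returns the NAME of every user whose USERNAME matches the given pattern.
    static List<String> findNames(Connection connection, String usernamePattern)
            throws SQLException {
        List<String> names = new ArrayList<>();
        // Prepare the statement; try-with-resources closes it afterwards.
        try (PreparedStatement statement = connection.prepareStatement(FIND_NAMES_SQL)) {
            // Bind the argument to the first (and only) query parameter.
            statement.setString(1, usernamePattern);
            // Execute the query and scroll through the result table row by row.
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    // Retrieve a single column value from the current row.
                    names.add(rows.getString("NAME"));
                }
            }
        }
        return names;
    }
}
```

Nearly all of these lines are plumbing; only the SQL string says anything about the data we want, and the result is still a list of column values rather than objects of the application's own classes.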
1.1.4 Persistence in object-oriented applications
In an object-oriented application, persistence allows an object to outlive the process that created it. The state of the object can be stored to disk, and an object with the same state can be re-created at some point in the future. This isn’t limited to single objects—entire networks of interconnected objects can be made persistent and later re-created in a new process. Most objects aren’t persistent; a transient object has a limited lifetime that is bounded by the life of the process that instantiated it. Almost all Java applications contain a mix of persistent and transient objects; hence, we need a subsystem that manages our persistent data. Modern relational databases provide a structured representation of persistent data, enabling the manipulating, sorting, searching, and aggregating of data. Database management systems are responsible for managing concurrency and data integrity; they’re responsible for sharing data between multiple users and multiple applications. They guarantee the integrity of the data through integrity rules that have been implemented with constraints. A database management system provides data-level security. When we discuss persistence in this book, we’re thinking of all these things:
■ Storage, organization, and retrieval of structured data
■ Concurrency and data integrity
■ Data sharing
And, in particular, we’re thinking of these problems in the context of an object-oriented application that uses a domain model. An application with a domain model doesn’t work directly with the tabular representation of the business entities; the application has its own object-oriented model of the business entities. If the database of an online auction system has ITEM and BID tables, for example, the Java application defines Item and Bid classes. Then, instead of directly working with the rows and columns of an SQL result set, the business logic interacts with this object-oriented domain model and its runtime realization as a network of interconnected objects. Each instance of a Bid has a reference to an auction Item, and each Item may have a collection of references to Bid instances. The business logic isn’t executed in the database (as an SQL stored procedure); it’s implemented in Java in the application tier. This allows business logic to make use of sophisticated object-oriented concepts such as inheritance and polymorphism. For example, we could use well-known design patterns such as Strategy, Mediator, and Composite (Gamma and others, 1995), all of which depend on polymorphic method calls. Now a caveat: Not all Java applications are designed this way, nor should they be. Simple applications may be much better off without a domain model. Complex applications may have to reuse existing stored procedures. SQL and the JDBC API are perfectly serviceable for dealing with pure tabular data, and the JDBC RowSet makes CRUD operations even easier. Working with a tabular representation of persistent data is straightforward and well understood. However, in the case of applications with nontrivial business logic, the domain model approach helps to improve code reuse and maintainability significantly. In practice, both strategies are common and needed. Many applications need to execute procedures that modify large sets of data, close to the data. 
At the same time, other application modules could benefit from an object-oriented domain model that executes regular online transaction processing logic in the application tier. An efficient way to bring persistent data closer to the application code is required. If we consider SQL and relational databases again, we finally observe the mismatch between the two paradigms. SQL operations such as projection and join always result in a tabular representation of the resulting data. (This is known as
closure; the result of an operation on relations is always a relation.) This is quite different from the network of interconnected objects used to execute the business logic in a Java application. These are fundamentally different models, not just different ways of visualizing the same model. With this realization, you can begin to see the problems (some well understood and some less well understood) that must be solved by an application that combines both data representations: an object-oriented domain model and a persistent relational model. Let’s take a closer look at this so-called paradigm mismatch.
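The Item and Bid classes mentioned in this section might look like the following sketch. It isn't taken from the book's example code: the field names, the placeBid() and highestBidAmount() methods, and the use of BigDecimal for amounts are assumptions made for illustration. It shows business logic that navigates a network of interconnected objects in both directions instead of reading the rows and columns of a result set.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical domain class corresponding to the ITEM table.
class Item {
    private final String name;
    // Each Item holds a collection of references to its Bid instances.
    private final List<Bid> bids = new ArrayList<>();

    Item(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    List<Bid> getBids() {
        return Collections.unmodifiableList(bids);
    }

    // Business logic in the application tier: creates a Bid and links
    // both sides of the association.
    Bid placeBid(BigDecimal amount) {
        Bid bid = new Bid(this, amount);
        bids.add(bid);
        return bid;
    }

    // Walks the object network; returns zero if there are no bids yet.
    BigDecimal highestBidAmount() {
        BigDecimal highest = BigDecimal.ZERO;
        for (Bid bid : bids) {
            if (bid.getAmount().compareTo(highest) > 0) {
                highest = bid.getAmount();
            }
        }
        return highest;
    }
}

// Hypothetical domain class corresponding to the BID table.
class Bid {
    // Each Bid has a reference back to the auction Item it was placed on.
    private final Item item;
    private final BigDecimal amount;

    Bid(Item item, BigDecimal amount) {
        this.item = item;
        this.amount = amount;
    }

    Item getItem() {
        return item;
    }

    BigDecimal getAmount() {
        return amount;
    }
}
```

Nothing in these classes mentions tables, columns, or SQL; getting such a network of objects into and out of the ITEM and BID tables is exactly the work a persistence layer has to do.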
1.2 The paradigm mismatch
The object/relational paradigm mismatch can be broken into several parts, which we’ll examine one at a time. Let’s start our exploration with a simple example that is problem free. As we build on it, you’ll begin to see the mismatch appear. Suppose you have to design and implement an online e-commerce application. In this application, you need a class to represent information about a user of the system, and another class to represent information about the user’s billing details, as shown in figure 1.1. In this diagram, you can see that a User has many BillingDetails. You can navigate the relationship between the classes in both directions. The classes representing these entities may be extremely simple:
public class User {
    private String username;
    private String name;
    private String address;
    private Set billingDetails;

    // Accessor methods (getter/setter), business methods, etc.
    ...
}

public class BillingDetails {
    private String accountNumber;
    private String accountName;
    private String accountType;
    private User user;

    // Accessor methods (getter/setter), business methods, etc.
    ...
}

Figure 1.1 A simple UML class diagram of the User and BillingDetails entities
Note that we’re only interested in the state of the entities with regard to persistence, so we’ve omitted the implementation of property accessors and business methods (such as getUsername() or billAuction()). It’s easy to come up with a good SQL schema design for this case:
create table USERS (
    USERNAME varchar(15) not null primary key,
    NAME varchar(50) not null,
    ADDRESS varchar(100)
);

create table BILLING_DETAILS (
    ACCOUNT_NUMBER varchar(10) not null primary key,
    ACCOUNT_NAME varchar(50) not null,
    ACCOUNT_TYPE varchar(2) not null,
    USERNAME varchar(15),
    foreign key (USERNAME) references USERS (USERNAME)
);
The relationship between the two entities is represented as the foreign key, USERNAME, in BILLING_DETAILS. For this simple domain model, the object/relational mismatch is barely in evidence; it’s straightforward to write JDBC code to insert, update, and delete information about users and billing details. Now, let’s see what happens when we consider something a little more realistic. The paradigm mismatch will be visible when we add more entities and entity relationships to our application. The most glaringly obvious problem with our current implementation is that we’ve designed an address as a simple String value. In most systems, it’s necessary to store street, city, state, country, and ZIP code information separately. Of course, we could add these properties directly to the User class, but because it’s highly likely that other classes in the system will also carry address information, it makes more sense to create a separate Address class. The updated model is shown in figure 1.2. Should we also add an ADDRESS table? Not necessarily. It’s common to keep address information in the USERS table, in individual columns. This design is likely to perform better, because a table join isn’t needed if you want to retrieve the user and address in a single query. The nicest solution may even be to create a user-defined SQL datatype to represent addresses, and to use a single column of that new type in the USERS table instead of several new columns. Basically, we have the choice of adding either several columns or a single column (of a new SQL datatype). This is clearly a problem of granularity.
Figure 1.2 The User has an Address
1.2.1 The problem of granularity
Granularity refers to the relative size of the types you’re working with. Let’s return to our example. Adding a new datatype to our database catalog, to store Address Java instances in a single column, sounds like the best approach. A new Address type (class) in Java and a new ADDRESS SQL datatype should guarantee interoperability. However, you’ll find various problems if you check the support for user-defined datatypes (UDT) in today’s SQL database management systems. UDT support is one of a number of so-called object-relational extensions to traditional SQL. This term alone is confusing, because it means that the database management system has (or is supposed to support) a sophisticated datatype system, something you take for granted if somebody sells you a system that can handle data in a relational fashion. Unfortunately, UDT support is a somewhat obscure feature of most SQL database management systems and certainly isn’t portable between different systems. Furthermore, the SQL standard supports user-defined datatypes, but poorly. This limitation isn’t the fault of the relational data model. You can consider the failure to standardize such an important piece of functionality as fallout from the object-relational database wars between vendors in the mid-1990s. Today, most developers accept that SQL products have limited type systems, no questions asked. However, even with a sophisticated UDT system in our SQL database management system, we would likely still duplicate the type declarations, writing the new type in Java and again in SQL. Attempts to find a solution for the Java space, such as SQLJ, unfortunately, have not had much success. For these and whatever other reasons, use of UDTs or Java types inside an SQL database isn’t common practice in the industry at this time, and it’s unlikely that you’ll encounter a legacy schema that makes extensive use of UDTs. 
We therefore can’t and won’t store instances of our new Address class in a single new column that has the same datatype as the Java layer. Our pragmatic solution for this problem has several columns of built-in vendor-defined SQL types (such as boolean, numeric, and string datatypes). The USERS table is usually defined as follows:
create table USERS (
    USERNAME varchar(15) not null primary key,
    NAME varchar(50) not null,
    ADDRESS_STREET varchar(50),
    ADDRESS_CITY varchar(15),
    ADDRESS_STATE varchar(15),
    ADDRESS_ZIPCODE varchar(5),
    ADDRESS_COUNTRY varchar(15)
)
Classes in our domain model come in a range of different levels of granularity, from coarse-grained entity classes like User, to finer-grained classes like Address, down to simple String-valued properties such as zipcode. In contrast, just two levels of granularity are visible at the level of the SQL database: tables such as USERS, and columns such as ADDRESS_ZIPCODE. Many simple persistence mechanisms fail to recognize this mismatch and so end up forcing the less flexible SQL representation upon the object model. We’ve seen countless User classes with properties named zipcode!
It turns out that the granularity problem isn’t especially difficult to solve. We probably wouldn’t even discuss it, were it not for the fact that it’s visible in so many existing systems. We describe the solution to this problem in chapter 4, section 4.4, “Fine-grained models and mappings.”
A much more difficult and interesting problem arises when we consider domain models that rely on inheritance, a feature of object-oriented design we may use to bill the users of our e-commerce application in new and interesting ways.
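The granularity mismatch can be sketched in plain Java. The following is a minimal illustration, not a mapping technique from this book: the toUsersRow() helper and the sample values are hypothetical, and only the column names come from the USERS table shown earlier. It shows a fine-grained Address object being flattened by hand into the columns of its owning User row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fine-grained value type: it has no table of its own.
class Address {
    final String street, city, state, zipcode, country;

    Address(String street, String city, String state, String zipcode, String country) {
        this.street = street;
        this.city = city;
        this.state = state;
        this.zipcode = zipcode;
        this.country = country;
    }
}

// Coarse-grained entity: one instance corresponds to one row of USERS.
class User {
    final String username;
    final String name;
    final Address address;

    User(String username, String name, Address address) {
        this.username = username;
        this.name = name;
        this.address = address;
    }
}

class GranularityDemo {
    // Hypothetical hand-written mapping: flattens User and Address into USERS columns.
    static Map<String, Object> toUsersRow(User user) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("USERNAME", user.username);
        row.put("NAME", user.name);
        Address a = user.address;
        row.put("ADDRESS_STREET", a == null ? null : a.street);
        row.put("ADDRESS_CITY", a == null ? null : a.city);
        row.put("ADDRESS_STATE", a == null ? null : a.state);
        row.put("ADDRESS_ZIPCODE", a == null ? null : a.zipcode);
        row.put("ADDRESS_COUNTRY", a == null ? null : a.country);
        return row;
    }

    public static void main(String[] args) {
        User user = new User("johndoe", "John Doe",
                new Address("1 Main St", "Springfield", "IL", "62701", "USA"));
        System.out.println(toUsersRow(user));
    }
}
```

The object model keeps three levels (User, Address, String), while the row has only two (table and column); the helper is where the extra level disappears.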
1.2.2
The problem of subtypes
In Java, you implement type inheritance using superclasses and subclasses. To illustrate why this can present a mismatch problem, let’s extend our e-commerce application so that it accepts not only bank account billing, but also credit and debit cards. The most natural way to reflect this change in the model is to use inheritance for the BillingDetails class. We may have an abstract BillingDetails superclass, along with several concrete subclasses: CreditCard, BankAccount, and so on. Each of these subclasses defines slightly different data (and completely different functionality that acts on that data). The UML class diagram in figure 1.3 illustrates this model.
Figure 1.3 Using inheritance for different billing strategies
SQL should probably include standard support for supertables and subtables. This would effectively allow us to create a table that inherits certain columns from
its parent. However, such a feature would be questionable, because it would introduce a new notion: virtual columns in base tables. Traditionally, we expect virtual columns only in virtual tables, which are called views. Furthermore, on a theoretical level, the inheritance we applied in Java is type inheritance. A table isn’t a type, so the notion of supertables and subtables is questionable. In any case, we can take the short route here and observe that SQL database products don’t generally implement type or table inheritance, and if they do implement it, they don’t follow a standard syntax and usually expose you to data integrity problems (limited integrity rules for updatable views). In chapter 5, section 5.1, “Mapping class inheritance,” we discuss how ORM solutions such as Hibernate solve the problem of persisting a class hierarchy to a database table or tables. This problem is now well understood in the community, and most solutions support approximately the same functionality. But we aren’t finished with inheritance. As soon as we introduce inheritance into the model, we have the possibility of polymorphism. The User class has an association to the BillingDetails superclass. This is a polymorphic association. At runtime, a User object may reference an instance of any of the subclasses of BillingDetails. Similarly, we want to be able to write polymorphic queries that refer to the BillingDetails class, and have the query return instances of its subclasses. SQL databases also lack an obvious way (or at least a standardized way) to represent a polymorphic association. A foreign key constraint refers to exactly one target table; it isn’t straightforward to define a foreign key that refers to multiple tables. We’d have to write a procedural constraint to enforce this kind of integrity rule. The result of this mismatch of subtypes is that the inheritance structure in your model must be persisted in an SQL database that doesn’t offer an inheritance strategy. 
Fortunately, three of the inheritance mapping solutions we show in chapter 5 are designed to accommodate the representation of polymorphic associations and the efficient execution of polymorphic queries. The next aspect of the object/relational mismatch problem is the issue of object identity. You probably noticed that we defined USERNAME as the primary key of our USERS table. Was that a good choice? How do we handle identical objects in Java?
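The polymorphic association discussed in this section can be sketched as follows. This is only an illustration under assumptions: the fields (owner, number, bankName) and the describe() method are hypothetical and aren’t taken from figure 1.3; only the class names come from the text.

```java
import java.util.List;

abstract class BillingDetails {
    private final String owner;

    protected BillingDetails(String owner) {
        this.owner = owner;
    }

    String getOwner() {
        return owner;
    }

    // Each subclass supplies its own behavior.
    abstract String describe();
}

class CreditCard extends BillingDetails {
    private final String number;

    CreditCard(String owner, String number) {
        super(owner);
        this.number = number;
    }

    @Override
    String describe() {
        return "credit card ending in " + number.substring(number.length() - 4);
    }
}

class BankAccount extends BillingDetails {
    private final String bankName;

    BankAccount(String owner, String bankName) {
        super(owner);
        this.bankName = bankName;
    }

    @Override
    String describe() {
        return "bank account at " + bankName;
    }
}

class User {
    // Polymorphic association: the runtime type may be any subclass.
    private final BillingDetails billingDetails;

    User(BillingDetails billingDetails) {
        this.billingDetails = billingDetails;
    }

    BillingDetails getBillingDetails() {
        return billingDetails;
    }
}

class SubtypesDemo {
    // An in-memory stand-in for a polymorphic query result: callers see only the supertype.
    static String describeAll(List<BillingDetails> all) {
        StringBuilder result = new StringBuilder();
        for (BillingDetails details : all) {
            if (result.length() > 0) {
                result.append("; ");
            }
            result.append(details.describe());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        User user = new User(new CreditCard("John Doe", "4111111111111111"));
        System.out.println(user.getBillingDetails().describe());
    }
}
```

In Java the reference in User needs no extra information to point at either subclass. A single foreign key column has no equivalent freedom, which is the mismatch described above.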
1.2.3
The problem of identity
Although the problem of object identity may not be obvious at first, we’ll encounter it often in our growing and expanding e-commerce system, such as when we need to check whether two objects are identical. There are three ways to tackle
this problem: two in the Java world and one in our SQL database. As expected, they work together only with some help. Java objects define two different notions of sameness:
■ Object identity (roughly equivalent to memory location, checked with a==b)
■ Equality as determined by the implementation of the equals() method (also called equality by value)
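The two Java notions of sameness can be seen side by side in a short sketch. The equals() implementation here compares usernames purely for illustration; it’s an assumption made for this example, not a recommendation for persistent classes.

```java
import java.util.Objects;

class User {
    private final String username;

    User(String username) {
        this.username = username;
    }

    // Illustrative equality by value: two users are equal if their usernames match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof User)) {
            return false;
        }
        return username.equals(((User) other).username);
    }

    @Override
    public int hashCode() {
        return Objects.hash(username);
    }
}

class IdentityDemo {
    // Object identity: both references point to the same instance.
    static boolean sameIdentity(Object a, Object b) {
        return a == b;
    }

    // Equality by value: delegates to equals().
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        User a = new User("johndoe");
        User b = new User("johndoe");
        System.out.println(sameIdentity(a, b)); // false: two distinct instances
        System.out.println(sameValue(a, b));    // true: same username
    }
}
```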
On the other hand, the identity of a database row is expressed as the primary key value. As you’ll see in chapter 9, section 9.2, “Object identity and equality,” neither equals() nor == is naturally equivalent to the primary key value. It’s common for several nonidentical objects to simultaneously represent the same row of the database, for example, in concurrently running application threads. Furthermore, some subtle difficulties are involved in implementing equals() correctly for a persistent class.
Let’s discuss another problem related to database identity with an example. In our table definition for USERS, we used USERNAME as a primary key. Unfortunately, this decision makes it difficult to change a username; we need to update not only the USERNAME column in USERS, but also the foreign key column in BILLING_DETAILS. To solve this problem, later in the book we’ll recommend that you use surrogate keys whenever you can’t find a good natural key (we’ll also discuss what makes a key good). A surrogate key column is a primary key column with no meaning to the user; in other words, a key that isn’t presented to the user and is only used for identification of data inside the software system. For example, we may change our table definitions to look like this:
create table USERS (
    USER_ID bigint not null primary key,
    USERNAME varchar(15) not null unique,
    NAME varchar(50) not null,
    ...
)
create table BILLING_DETAILS (
    BILLING_DETAILS_ID bigint not null primary key,
    ACCOUNT_NUMBER VARCHAR(10) not null unique,
    ACCOUNT_NAME VARCHAR(50) not null,
    ACCOUNT_TYPE VARCHAR(2) not null,
    USER_ID bigint foreign key references USERS
)
The USER_ID and BILLING_DETAILS_ID columns contain system-generated values. These columns were introduced purely for the benefit of the data model, so how
(if at all) should they be represented in the domain model? We discuss this question in chapter 4, section 4.2, “Mapping entities with identity,” and we find a solution with ORM. In the context of persistence, identity is closely related to how the system handles caching and transactions. Different persistence solutions have chosen different strategies, and this has been an area of confusion. We cover all these interesting topics—and show how they’re related—in chapters 10 and 13. So far, the skeleton e-commerce application we’ve designed has identified the mismatch problems with mapping granularity, subtypes, and object identity. We’re almost ready to move on to other parts of the application, but first we need to discuss the important concept of associations: how the relationships between our classes are mapped and handled. Is the foreign key in the database all you need?
1.2.4
Problems relating to associations
In our domain model, associations represent the relationships between entities. The User, Address, and BillingDetails classes are all associated; but unlike Address, BillingDetails stands on its own. BillingDetails instances are stored in their own table. Association mapping and the management of entity associations are central concepts in any object persistence solution. Object-oriented languages represent associations using object references; but in the relational world, an association is represented as a foreign key column, with copies of key values (and a constraint to guarantee integrity). There are substantial differences between the two representations. Object references are inherently directional; the association is from one object to the other. They’re pointers. If an association between objects should be navigable in both directions, you must define the association twice, once in each of the associated classes. You’ve already seen this in the domain model classes:
public class User {
    private Set billingDetails;
    ...
}
public class BillingDetails {
    private User user;
    ...
}
On the other hand, foreign key associations aren’t by nature directional. Navigation has no meaning for a relational data model because you can create arbitrary data associations with table joins and projection. The challenge is to bridge a completely open data model, which is independent of the application that works with
the data, to an application-dependent navigational model, a constrained view of the associations needed by this particular application. It isn’t possible to determine the multiplicity of a unidirectional association by looking only at the Java classes. Java associations can have many-to-many multiplicity. For example, the classes could look like this:
public class User {
    private Set billingDetails;
    ...
}
public class BillingDetails {
    private Set users;
    ...
}
Table associations, on the other hand, are always one-to-many or one-to-one. You can see the multiplicity immediately by looking at the foreign key definition. The following is a foreign key declaration on the BILLING_DETAILS table for a one-to-many association (or, if read in the other direction, a many-to-one association):
USER_ID bigint foreign key references USERS
These are one-to-one associations:
USER_ID bigint unique foreign key references USERS
BILLING_DETAILS_ID bigint primary key foreign key references USERS
If you wish to represent a many-to-many association in a relational database, you must introduce a new table, called a link table. This table doesn’t appear anywhere in the domain model. For our example, if we consider the relationship between the user and the billing information to be many-to-many, the link table is defined as follows:
create table USER_BILLING_DETAILS (
    USER_ID bigint foreign key references USERS,
    BILLING_DETAILS_ID bigint foreign key references BILLING_DETAILS,
    PRIMARY KEY (USER_ID, BILLING_DETAILS_ID)
)
We discuss association and collection mappings in great detail in chapters 6 and 7. So far, the issues we’ve considered are mainly structural. We can see them by considering a purely static view of the system. Perhaps the most difficult problem in object persistence is a dynamic problem. It concerns associations, and we’ve already hinted at it when we drew a distinction between object network navigation and table joins in section 1.1.4, “Persistence in object-oriented applications.” Let’s explore this significant mismatch problem in more depth.
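Because object references are directional, a bidirectional association in Java has two ends that the application must keep consistent itself. The following sketch illustrates this; the addBillingDetails() convenience method is a hypothetical helper introduced for this example, and the generic Set type is our addition to the classes shown earlier.

```java
import java.util.HashSet;
import java.util.Set;

class User {
    private final Set<BillingDetails> billingDetails = new HashSet<>();

    Set<BillingDetails> getBillingDetails() {
        return billingDetails;
    }

    // Hypothetical convenience method: sets both ends of the association together.
    void addBillingDetails(BillingDetails details) {
        details.setUser(this);
        billingDetails.add(details);
    }
}

class BillingDetails {
    private User user;

    User getUser() {
        return user;
    }

    void setUser(User user) {
        this.user = user;
    }
}

class AssociationDemo {
    public static void main(String[] args) {
        User user = new User();
        BillingDetails details = new BillingDetails();

        // Only one direction is set: the back pointer is still missing.
        user.getBillingDetails().add(details);
        System.out.println(details.getUser() == null); // true

        // Both directions are set by the helper.
        user.addBillingDetails(details);
        System.out.println(details.getUser() == user); // true
    }
}
```

A single foreign key column in BILLING_DETAILS carries the same information as both Java references combined, which is why nothing comparable has to be maintained on the relational side.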
1.2.5
The problem of data navigation
There is a fundamental difference in the way you access data in Java and in a relational database. In Java, when you access a user’s billing information, you call aUser.getBillingDetails().getAccountNumber() or something similar. This is the most natural way to access object-oriented data, and it’s often described as walking the object network. You navigate from one object to another, following pointers between instances. Unfortunately, this isn’t an efficient way to retrieve data from an SQL database. The single most important thing you can do to improve the performance of data access code is to minimize the number of requests to the database. The most obvious way to do this is to minimize the number of SQL queries. (Of course, there are other more sophisticated ways that follow as a second step.) Therefore, efficient access to relational data with SQL usually requires joins between the tables of interest. The number of tables included in the join when retrieving data determines the depth of the object network you can navigate in memory. For example, if you need to retrieve a User and aren’t interested in the user’s billing information, you can write this simple query:
select * from USERS u where u.USER_ID = 123
On the other hand, if you need to retrieve a User and then subsequently visit each of the associated BillingDetails instances (let’s say, to list all the user’s credit cards), you write a different query:
select *
from USERS u
left outer join BILLING_DETAILS bd on bd.USER_ID = u.USER_ID
where u.USER_ID = 123
As you can see, to efficiently use joins you need to know what portion of the object network you plan to access when you retrieve the initial User—this is before you start navigating the object network! On the other hand, any object persistence solution provides functionality for fetching the data of associated objects only when the object is first accessed. However, this piecemeal style of data access is fundamentally inefficient in the context of a relational database, because it requires executing one statement for each node or collection of the object network that is accessed. This is the dreaded n+1 selects problem. This mismatch in the way you access objects in Java and in a relational database is perhaps the single most common source of performance problems in Java applications. There is a natural tension between too many selects and too big
selects, which retrieve unnecessary information into memory. Yet, although we’ve been blessed with innumerable books and magazine articles advising us to use StringBuffer for string concatenation, it seems impossible to find any advice about strategies for avoiding the n+1 selects problem. Fortunately, Hibernate provides sophisticated features for efficiently and transparently fetching networks of objects from the database to the application accessing them. We discuss these features in chapters 13, 14, and 15.
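The arithmetic behind the n+1 selects problem can be shown with a small simulation. This sketch makes no use of a real database or of Hibernate: CountingDatabase is a hypothetical stand-in that only counts the statements it receives, and the SQL in the comments merely indicates what each call represents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stand-in for a database connection that counts the statements sent to it.
class CountingDatabase {
    int statements = 0;
    private final Map<Long, List<String>> billingByUser = new TreeMap<>();

    CountingDatabase(int users) {
        for (long id = 1; id <= users; id++) {
            billingByUser.put(id, List.of("account-" + id));
        }
    }

    // Represents: select USER_ID from USERS
    List<Long> selectUserIds() {
        statements++;
        return new ArrayList<>(billingByUser.keySet());
    }

    // Represents: select * from BILLING_DETAILS where USER_ID = ?
    List<String> selectBillingDetails(long userId) {
        statements++;
        return billingByUser.get(userId);
    }

    // Represents: select * from USERS u left outer join BILLING_DETAILS bd on ...
    Map<Long, List<String>> selectUsersWithBillingDetails() {
        statements++;
        return new TreeMap<>(billingByUser);
    }
}

class NPlusOneDemo {
    // Walks the object network piecemeal: one statement for the users, one per user after that.
    static int lazyTraversal(int users) {
        CountingDatabase db = new CountingDatabase(users);
        for (long id : db.selectUserIds()) {
            db.selectBillingDetails(id);
        }
        return db.statements;
    }

    // Decides up front which part of the network is needed and fetches it with one join.
    static int joinedFetch(int users) {
        CountingDatabase db = new CountingDatabase(users);
        db.selectUsersWithBillingDetails();
        return db.statements;
    }

    public static void main(String[] args) {
        System.out.println(lazyTraversal(10)); // 11
        System.out.println(joinedFetch(10));   // 1
    }
}
```

The join isn’t free either: if the caller never looks at the billing details, the single statement retrieves data that is never used, which is the tension between too many selects and too big selects.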
1.2.6
The cost of the mismatch
We now have quite a list of object/relational mismatch problems, and it will be costly (in time and effort) to find solutions, as you may know from experience. This cost is often underestimated, and we think this is a major reason for many failed software projects. In our experience (regularly confirmed by developers we talk to), the main purpose of up to 30 percent of the Java application code written is to handle the tedious SQL/JDBC and manual bridging of the object/relational paradigm mismatch. Despite all this effort, the end result still doesn’t feel quite right. We’ve seen projects nearly sink due to the complexity and inflexibility of their database abstraction layers. We also see Java developers (and DBAs) quickly lose their confidence when design decisions about the persistence strategy for a project have to be made. One of the major costs is in the area of modeling. The relational and domain models must both encompass the same business entities, but an object-oriented purist will model these entities in a different way than an experienced relational data modeler would. The usual solution to this problem is to bend and twist the domain model and the implemented classes until they match the SQL database schema. (Which, following the principle of data independence, is certainly a safe long-term choice.) This can be done successfully, but only at the cost of losing some of the advantages of object orientation. Keep in mind that relational modeling is underpinned by relational theory. Object orientation has no such rigorous mathematical definition or body of theoretical work, so we can’t look to mathematics to explain how we should bridge the gap between the two paradigms—there is no elegant transformation waiting to be discovered. (Doing away with Java and SQL, and starting from scratch isn’t considered elegant.) The domain modeling mismatch isn’t the only source of the inflexibility and the lost productivity that lead to higher costs. 
A further cause is the JDBC API itself. JDBC and SQL provide a statement-oriented (that is, command-oriented) approach to moving data to and from an SQL database. If you want to query or
manipulate data, the tables and columns involved must be specified at least three times (insert, update, select), adding to the time required for design and implementation. The distinct dialects for every SQL database management system don’t improve the situation. To round out your understanding of object persistence, and before we approach possible solutions, we need to discuss application architecture and the role of a persistence layer in typical application design.
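The repetition of table and column names is easy to see in hand-written JDBC code. The following sketch assumes the surrogate-key USERS table from section 1.2.3 and uses only the three columns shown there; the UserStatements class and its insert() method are hypothetical names for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class UserStatements {
    // The same table and columns are spelled out once per kind of statement.
    static final String INSERT =
            "insert into USERS (USER_ID, USERNAME, NAME) values (?, ?, ?)";
    static final String UPDATE =
            "update USERS set USERNAME = ?, NAME = ? where USER_ID = ?";
    static final String SELECT =
            "select USER_ID, USERNAME, NAME from USERS where USER_ID = ?";

    // Binds the parameters in the order in which INSERT lists its columns.
    static void insert(Connection connection, long userId, String username, String name)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT)) {
            statement.setLong(1, userId);
            statement.setString(2, username);
            statement.setString(3, name);
            statement.executeUpdate();
        }
    }

    public static void main(String[] args) {
        System.out.println(INSERT);
        System.out.println(UPDATE);
        System.out.println(SELECT);
    }
}
```

Adding or renaming a column means touching all three strings, plus the parameter binding code that depends on their order.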
1.3
Persistence layers and alternatives
In a medium- or large-sized application, it usually makes sense to organize classes by concern. Persistence is one concern; others include presentation, workflow, and business logic.1 A typical object-oriented architecture includes layers of code that represent the concerns. It’s normal and certainly best practice to group all classes and components responsible for persistence into a separate persistence layer in a layered system architecture. In this section, we first look at the layers of this type of architecture and why we use them. After that, we focus on the layer we’re most interested in—the persistence layer—and some of the ways it can be implemented.
1.3.1
Layered architecture
A layered architecture defines interfaces between code that implements the various concerns, allowing changes to be made to the way one concern is implemented without significant disruption to code in the other layers. Layering also determines the kinds of interlayer dependencies that occur. The rules are as follows:
■ Layers communicate from top to bottom. A layer is dependent only on the layer directly below it.
■ Each layer is unaware of any other layers except for the layer just below it.
Different systems group concerns differently, so they define different layers. A typical, proven, high-level application architecture uses three layers: one each for presentation, business logic, and persistence, as shown in figure 1.4. Let’s take a closer look at the layers and elements in the diagram:
1 There are also the so-called cross-cutting concerns, which may be implemented generically (by framework code, for example). Typical cross-cutting concerns include logging, authorization, and transaction demarcation.
Figure 1.4 A persistence layer is the basis in a layered architecture
■ Presentation layer: The user interface logic is topmost. Code responsible for the presentation and control of page and screen navigation is in the presentation layer.
■ Business layer: The exact form of the next layer varies widely between applications. It’s generally agreed, however, that the business layer is responsible for implementing any business rules or system requirements that would be understood by users as part of the problem domain. This layer usually includes some kind of controlling component (code that knows when to invoke which business rule). In some systems, this layer has its own internal representation of the business domain entities, and in others it reuses the model defined by the persistence layer. We revisit this issue in chapter 3.
■ Persistence layer: The persistence layer is a group of classes and components responsible for storing data to, and retrieving it from, one or more data stores. This layer necessarily includes a model of the business domain entities (even if it’s only a metadata model).
■ Database: The database exists outside the Java application itself. It’s the actual, persistent representation of the system state. If an SQL database is used, the database includes the relational schema and possibly stored procedures.
■ Helper and utility classes: Every application has a set of infrastructural helper or utility classes that are used in every layer of the application (such as Exception classes for error handling). These infrastructural elements don’t form a layer, because they don’t obey the rules for interlayer dependency in a layered architecture.
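The dependency rules for layers can be sketched with three small classes. All names here (UserStore, RegistrationService, RegistrationScreen) are hypothetical, and an in-memory map stands in for the database; the point is only that each class refers to nothing but the layer directly below it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Persistence layer: the only code that knows how data is stored.
interface UserStore {
    void save(String username, String name);

    Optional<String> findName(String username);
}

class InMemoryUserStore implements UserStore {
    private final Map<String, String> namesByUsername = new HashMap<>();

    @Override
    public void save(String username, String name) {
        namesByUsername.put(username, name);
    }

    @Override
    public Optional<String> findName(String username) {
        return Optional.ofNullable(namesByUsername.get(username));
    }
}

// Business layer: depends only on the persistence layer's interface.
class RegistrationService {
    private final UserStore store;

    RegistrationService(UserStore store) {
        this.store = store;
    }

    // Business rule: usernames must be non-blank and unique.
    boolean register(String username, String name) {
        if (username == null || username.isBlank() || store.findName(username).isPresent()) {
            return false;
        }
        store.save(username, name);
        return true;
    }
}

// Presentation layer: depends only on the business layer.
class RegistrationScreen {
    private final RegistrationService service;

    RegistrationScreen(RegistrationService service) {
        this.service = service;
    }

    String submit(String username, String name) {
        return service.register(username, name) ? "Welcome, " + name : "Username unavailable";
    }
}

class LayersDemo {
    public static void main(String[] args) {
        RegistrationScreen screen =
                new RegistrationScreen(new RegistrationService(new InMemoryUserStore()));
        System.out.println(screen.submit("johndoe", "John Doe"));
        System.out.println(screen.submit("johndoe", "Someone Else"));
    }
}
```

Replacing InMemoryUserStore with a different implementation of UserStore requires no change to the two upper layers, which is the benefit the layering rules are meant to deliver.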
Let’s now take a brief look at the various ways the persistence layer can be implemented by Java applications. Don’t worry—we’ll get to ORM and Hibernate soon. There is much to be learned by looking at other approaches.
1.3.2
Hand-coding a persistence layer with SQL/JDBC
The most common approach to Java persistence is for application programmers to work directly with SQL and JDBC. After all, developers are familiar with relational database management systems, they understand SQL, and they know how to work with tables and foreign keys. Moreover, they can always use the well-known and widely used data access object (DAO) pattern to hide complex JDBC code and nonportable SQL from the business logic. The DAO pattern is a good one—so good that we often recommend its use even with ORM. However, the work involved in manually coding persistence for each domain class is considerable, particularly when multiple SQL dialects are supported. This work usually ends up consuming a large portion of the development effort. Furthermore, when requirements change, a hand-coded solution always requires more attention and maintenance effort. Why not implement a simple mapping framework to fit the specific requirements of your project? The result of such an effort could even be reused in future projects. Many developers have taken this approach; numerous homegrown object/relational persistence layers are in production systems today. However, we don’t recommend this approach. Excellent solutions already exist: not only the (mostly expensive) tools sold by commercial vendors, but also open source projects with free licenses. We’re certain you’ll be able to find a solution that meets your requirements, both business and technical. It’s likely that such a solution will do a great deal more, and do it better, than a solution you could build in a limited time. Developing a reasonably full-featured ORM may take many developers months. For example, Hibernate is about 80,000 lines of code, some of which is much more difficult than typical application code, along with 25,000 lines of unit test code. This may be more code than is in your application. A great many details can easily be overlooked in such a large project—as both the authors know from experience! 
Even if an existing tool doesn’t fully implement two or three of your more exotic requirements, it’s still probably not worth creating your own tool. Any ORM software will handle the tedious common cases—the ones that kill productivity. It’s OK if you need to hand-code certain special cases; few applications are composed primarily of special cases.
1.3.3
Using serialization
Java has a built-in persistence mechanism: Serialization provides the ability to write a snapshot of a network of objects (the state of the application) to a byte stream, which may then be persisted to a file or database. Serialization is also used by Java’s Remote Method Invocation (RMI) to achieve pass-by-value semantics for complex objects. Another use of serialization is to replicate application state across nodes in a cluster of machines.
Why not use serialization for the persistence layer? Unfortunately, a serialized network of interconnected objects can only be accessed as a whole; it’s impossible to retrieve any data from the stream without deserializing the entire stream. Thus, the resulting byte stream must be considered unsuitable for arbitrary search or aggregation of large datasets. It isn’t even possible to access or update a single object or subset of objects independently. Loading and overwriting an entire object network in each transaction is no option for systems designed to support high concurrency.
Given current technology, serialization is inadequate as a persistence mechanism for high-concurrency web and enterprise applications. It has a particular niche as a suitable persistence mechanism for desktop applications.
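The all-or-nothing nature of serialization is visible in a few lines of code. This is a minimal sketch using the standard java.io object streams; the User class, its fields, and the findByUsername() helper are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class User implements Serializable {
    private static final long serialVersionUID = 1L;

    final String username;
    final List<String> accountNumbers = new ArrayList<>();

    User(String username) {
        this.username = username;
    }
}

class SerializationDemo {
    // Writes a snapshot of the whole object network to a byte array.
    static byte[] write(List<User> users) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(users));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the snapshot back; there is no way to read only part of it.
    @SuppressWarnings("unchecked")
    static List<User> read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (List<User>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Finding one user still means deserializing every user first.
    static User findByUsername(byte[] data, String username) {
        for (User user : read(data)) {
            if (user.username.equals(username)) {
                return user;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        User john = new User("johndoe");
        john.accountNumbers.add("12345");
        byte[] snapshot = write(List.of(john, new User("janedoe")));
        System.out.println(findByUsername(snapshot, "johndoe").accountNumbers);
    }
}
```

An update has the same shape in reverse: change one object in memory, then write the entire list again.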
1.3.4
Object-oriented database systems
Because we work with objects in Java, it would be ideal if there were a way to store those objects in a database without having to bend and twist the object model at all. In the mid-1990s, object-oriented database systems gained attention. They’re based on a network data model, which was common before the advent of the relational data model decades ago. The basic idea is to store a network of objects, with all its pointers and nodes, and to re-create the same in-memory graph later on. This can be optimized with various metadata and configuration settings. An object-oriented database management system (OODBMS) is more like an extension to the application environment than an external data store. An OODBMS usually features a multitiered implementation, with the backend data store, object cache, and client application coupled tightly together and interacting via a proprietary network protocol. Object nodes are kept on pages of memory, which are transported from and to the data store. Object-oriented database development begins with the top-down definition of host language bindings that add persistence capabilities to the programming language. Hence, object databases offer seamless integration into the object-oriented application environment. This is different from the model used by today’s
relational databases, where interaction with the database occurs via an intermediate language (SQL) and data independence from a particular application is the major concern. For background information on object-oriented databases, we recommend the respective chapter in An Introduction to Database Systems (Date, 2003). We won’t bother looking too closely into why object-oriented database technology hasn’t been more popular; we’ll observe that object databases haven’t been widely adopted and that it doesn’t appear likely that they will be in the near future. We’re confident that the overwhelming majority of developers will have far more opportunity to work with relational technology, given the current political realities (predefined deployment environments) and the common requirement for data independence.
1.3.5
Other options
Of course, there are other kinds of persistence layers. XML persistence is a variation on the serialization theme; this approach addresses some of the limitations of byte-stream serialization by allowing easy access to the data through a standardized tool interface. However, managing data in XML would expose you to an object/hierarchical mismatch. Furthermore, there is no additional benefit from the XML itself, because it’s just another text file format and has no inherent capabilities for data management. You can use stored procedures (even writing them in Java, sometimes) and move the problem into the database tier. So-called object-relational databases have been marketed as a solution, but they offer only a more sophisticated datatype system providing only half the solution to our problems (and further muddling terminology). We’re sure there are plenty of other examples, but none of them are likely to become popular in the immediate future. Political and economic constraints (long-term investments in SQL databases), data independence, and the requirement for access to valuable legacy data call for a different approach. ORM may be the most practical solution to our problems.
1.4
Object/relational mapping
Now that we’ve looked at the alternative techniques for object persistence, it’s time to introduce the solution we feel is the best, and the one we use with Hibernate: ORM. Despite its long history (the first research papers were published in the late 1980s), the terms for ORM used by developers vary. Some call it object relational mapping, others prefer the simple object mapping; we exclusively use
the term object/relational mapping and its acronym, ORM. The slash stresses the mismatch problem that occurs when the two worlds collide. In this section, we first look at what ORM is. Then we enumerate the problems that a good ORM solution needs to solve. Finally, we discuss the general benefits that ORM provides and why we recommend this solution.
1.4.1
What is ORM?
In a nutshell, object/relational mapping is the automated (and transparent) persistence of objects in a Java application to the tables in a relational database, using metadata that describes the mapping between the objects and the database. ORM, in essence, works by (reversibly) transforming data from one representation to another. This implies certain performance penalties. However, if ORM is implemented as middleware, there are many opportunities for optimization that wouldn’t exist for a hand-coded persistence layer. The provision and management of metadata that governs the transformation adds to the overhead at development time, but the cost is less than equivalent costs involved in maintaining a hand-coded solution. (And even object databases require significant amounts of metadata.)
FAQ
Isn’t ORM a Visio plug-in? The acronym ORM can also mean object role modeling, and this term was invented before object/relational mapping became relevant. It describes a method for information analysis, used in database modeling, and is primarily supported by Microsoft Visio, a graphical modeling tool. Database specialists use it as a replacement or as an addition to the more popular entity-relationship modeling. However, if you talk to Java developers about ORM, it’s usually in the context of object/relational mapping.
An ORM solution consists of the following four pieces:
■ An API for performing basic CRUD operations on objects of persistent classes
■ A language or API for specifying queries that refer to classes and properties of classes
■ A facility for specifying mapping metadata
■ A technique for the ORM implementation to interact with transactional objects to perform dirty checking, lazy association fetching, and other optimization functions
We’re using the term full ORM to include any persistence layer where SQL is automatically generated from a metadata-based description. We aren’t including persistence layers where the object/relational mapping problem is solved manually by developers hand-coding SQL with JDBC. With ORM, the application interacts with the ORM APIs and the domain model classes and is abstracted from the underlying SQL/JDBC. Depending on the features or the particular implementation, the ORM engine may also take on responsibility for issues such as optimistic locking and caching, relieving the application of these concerns entirely. Let’s look at the various ways ORM can be implemented. Mark Fussel (Fussel, 1997), a developer in the field of ORM, defined the following four levels of ORM quality. We have slightly rewritten his descriptions and put them in the context of today’s Java application development. Pure relational The whole application, including the user interface, is designed around the relational model and SQL-based relational operations. This approach, despite its deficiencies for large systems, can be an excellent solution for simple applications where a low level of code reuse is tolerable. Direct SQL can be fine-tuned in every aspect, but the drawbacks, such as lack of portability and maintainability, are significant, especially in the long run. Applications in this category often make heavy use of stored procedures, shifting some of the work out of the business layer and into the database. Light object mapping Entities are represented as classes that are mapped manually to the relational tables. Hand-coded SQL/JDBC is hidden from the business logic using wellknown design patterns. This approach is extremely widespread and is successful for applications with a small number of entities, or applications with generic, metadata-driven data models. Stored procedures may have a place in this kind of application. 
Medium object mapping
The application is designed around an object model. SQL is generated at build time using a code-generation tool, or at runtime by framework code. Associations between objects are supported by the persistence mechanism, and queries may be specified using an object-oriented expression language. Objects are cached by the persistence layer. A great many ORM products and homegrown persistence layers support at least this level of functionality. It’s well suited to medium-sized applications with some complex transactions, particularly when portability between different database products is important. These applications usually don’t use stored procedures.
Full object mapping
Full object mapping supports sophisticated object modeling: composition, inheritance, polymorphism, and persistence by reachability. The persistence layer implements transparent persistence; persistent classes do not inherit from any special base class or have to implement a special interface. Efficient fetching strategies (lazy, eager, and prefetching) and caching strategies are implemented transparently to the application. This level of functionality can hardly be achieved by a homegrown persistence layer; it’s equivalent to years of development time. A number of commercial and open source Java ORM tools have achieved this level of quality. This level meets the definition of ORM we’re using in this book.
Let’s look at the problems we expect to be solved by a tool that achieves full object mapping.
1.4.2
Generic ORM problems
The following list of issues, which we’ll call the ORM problems, identifies the fundamental questions resolved by a full object/relational mapping tool in a Java environment. Particular ORM tools may provide extra functionality (for example, aggressive caching), but this is a reasonably exhaustive list of the conceptual issues and questions that are specific to object/relational mapping.
1. What do persistent classes look like? How transparent is the persistence tool? Do we have to adopt a programming model and conventions for classes of the business domain?
2. How is mapping metadata defined? Because the object/relational transformation is governed entirely by metadata, the format and definition of this metadata is important. Should an ORM tool provide a GUI interface to manipulate the metadata graphically? Or are there better approaches to metadata definition?
3. How do object identity and equality relate to database (primary key) identity? How do we map instances of particular classes to particular table rows?
4. How should we map class inheritance hierarchies? There are several standard strategies. What about polymorphic associations, abstract classes, and interfaces?
5. How does the persistence logic interact at runtime with the objects of the business domain? This is a problem of generic programming, and there are a number of solutions including source generation, runtime reflection, runtime bytecode generation, and build-time bytecode enhancement. The solution to this problem may affect your build process (but, preferably, shouldn’t otherwise affect you as a user).
6. What is the lifecycle of a persistent object? Does the lifecycle of some objects depend upon the lifecycle of other associated objects? How do we translate the lifecycle of an object to the lifecycle of a database row?
7. What facilities are provided for sorting, searching, and aggregating? The application could do some of these things in memory, but efficient use of relational technology requires that this work often be performed by the database.
8. How do we efficiently retrieve data with associations? Efficient access to relational data is usually accomplished via table joins. Object-oriented applications usually access data by navigating an object network. Two data access patterns should be avoided when possible: the n+1 selects problem, and its complement, the Cartesian product problem (fetching too much data in a single select).
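The n+1 selects problem named in the last question is easiest to see by counting statements. The sketch below is a toy simulation, not database code: an in-memory map stands in for hypothetical ITEM and BID tables, and every method that would issue a SELECT increments a counter. All class, method, and table names are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "database" that counts simulated SELECT statements.
class SelectCountSketch {
    static int selects = 0;

    // Item id -> bid amounts, standing in for an ITEM table and a BID table.
    static final Map<Integer, List<Integer>> BIDS_BY_ITEM = new LinkedHashMap<>();
    static {
        BIDS_BY_ITEM.put(1, Arrays.asList(10, 12));
        BIDS_BY_ITEM.put(2, Arrays.asList(7));
        BIDS_BY_ITEM.put(3, Arrays.asList(3, 4));
    }

    // Simulates: select ID from ITEM
    static List<Integer> selectItemIds() {
        selects++;
        return new ArrayList<>(BIDS_BY_ITEM.keySet());
    }

    // Simulates: select AMOUNT from BID where ITEM_ID = ?
    static List<Integer> selectBidsForItem(int itemId) {
        selects++;
        return BIDS_BY_ITEM.get(itemId);
    }

    // Simulates: select ... from ITEM left outer join BID on ...
    static Map<Integer, List<Integer>> selectItemsJoinBids() {
        selects++;
        return new LinkedHashMap<>(BIDS_BY_ITEM);
    }

    // Naive navigation of the object network: one select for the items, one more per item.
    static int countBidsNaively() {
        int total = 0;
        for (int itemId : selectItemIds()) {
            total += selectBidsForItem(itemId).size();
        }
        return total;
    }

    // Items and bids fetched together in a single joined select.
    static int countBidsWithJoin() {
        int total = 0;
        for (List<Integer> bids : selectItemsJoinBids().values()) {
            total += bids.size();
        }
        return total;
    }

    public static void main(String[] args) {
        selects = 0;
        countBidsNaively();
        System.out.println("naive navigation: " + selects + " selects");

        selects = 0;
        countBidsWithJoin();
        System.out.println("joined fetch: " + selects + " selects");
    }
}
```

With three items, naive navigation costs 3 + 1 = 4 simulated selects, whereas the joined fetch costs one. The sketch doesn't model the opposite danger: a real join returns one row per item/bid combination, so joining several collections at once can multiply the result size (the Cartesian product problem).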
Two additional issues that impose fundamental constraints on the design and architecture of an ORM tool are common to any data access technology:
■ Transactions and concurrency
■ Cache management (and concurrency)
As you can see, a full object/relational mapping tool needs to address quite a long list of issues. By now, you should be starting to see the value of ORM. In the next section, we look at some of the other benefits you gain when you use an ORM solution.
1.4.3
Why ORM?
An ORM implementation is a complex beast—less complex than an application server, but more complex than a web application framework like Struts or Tapestry. Why should we introduce another complex infrastructural element into our system? Will it be worth it? It will take us most of this book to provide a complete answer to those questions, but this section provides a quick summary of the most compelling benefits. First, though, let’s quickly dispose of a nonbenefit.
A supposed advantage of ORM is that it shields developers from messy SQL. This view holds that object-oriented developers can’t be expected to understand SQL or relational databases well, and that they find SQL somehow offensive. On the contrary, we believe that Java developers must have a sufficient level of familiarity with, and appreciation of, relational modeling and SQL in order to work with ORM. ORM is an advanced technique to be used by developers who have already done it the hard way. To use Hibernate effectively, you must be able to view and interpret the SQL statements it issues and understand the implications for performance.
Now, let’s look at some of the benefits of ORM and Hibernate.
Productivity
Persistence-related code can be perhaps the most tedious code in a Java application. Hibernate eliminates much of the grunt work (more than you’d expect) and lets you concentrate on the business problem. No matter which application-development strategy you prefer (top-down, starting with a domain model, or bottom-up, starting with an existing database schema), Hibernate, used together with the appropriate tools, will significantly reduce development time.
Maintainability
Fewer lines of code (LOC) make the system more understandable, because it emphasizes business logic rather than plumbing. Most important, a system with less code is easier to refactor. Automated object/relational persistence substantially reduces LOC. Of course, counting lines of code is a debatable way of measuring application complexity.
However, there are other reasons that a Hibernate application is more maintainable. In systems with hand-coded persistence, an inevitable tension exists between the relational representation and the object model implementing the domain. Changes to one almost always involve changes to the other, and often the design of one representation is compromised to accommodate the existence of the other.
(What almost always happens in practice is that the object model of the domain is compromised.) ORM provides a buffer between the two models, allowing more elegant use of object orientation on the Java side, and insulating each model from minor changes to the other.
Performance
A common claim is that hand-coded persistence can always be at least as fast, and can often be faster, than automated persistence. This is true in the same sense that it’s true that assembly code can always be at least as fast as Java code, or a handwritten parser can always be at least as fast as a parser generated by YACC or ANTLR; in other words, it’s beside the point. The unspoken implication of the claim is that hand-coded persistence will perform at least as well in an actual application. But this implication will be true only if the effort required to implement at-least-as-fast hand-coded persistence is similar to the amount of effort involved in utilizing an automated solution. The really interesting question is what happens when we consider time and budget constraints?
Given a persistence task, many optimizations are possible. Some (such as query hints) are much easier to achieve with hand-coded SQL/JDBC. Most optimizations, however, are much easier to achieve with automated ORM. In a project with time constraints, hand-coded persistence usually allows you to make some optimizations. Hibernate allows many more optimizations to be used all the time. Furthermore, automated persistence improves developer productivity so much that you can spend more time hand-optimizing the few remaining bottlenecks.
Finally, the people who implemented your ORM software probably had much more time to investigate performance optimizations than you have. Did you know, for instance, that pooling PreparedStatement instances results in a significant performance increase for the DB2 JDBC driver but breaks the InterBase JDBC driver? Did you realize that updating only the changed columns of a table can be significantly faster for some databases but potentially slower for others? In your handcrafted solution, how easy is it to experiment with the impact of these various strategies?
Vendor independence
An ORM abstracts your application away from the underlying SQL database and SQL dialect. If the tool supports a number of different databases (and most do), this confers a certain level of portability on your application.
You shouldn’t necessarily expect write-once/run-anywhere, because the capabilities of databases differ, and achieving full portability would require sacrificing some of the strength of the more powerful platforms. Nevertheless, it’s usually much easier to develop a cross-platform application using ORM. Even if you don’t require cross-platform operation, an ORM can still help mitigate some of the risks associated with vendor lock-in. In addition, database independence helps in development scenarios where developers use a lightweight local database but deploy for production on a different database.
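To make the idea of abstracting away the SQL dialect concrete, here is a minimal sketch of how a tool might hide one well-known difference between databases: how to limit the number of rows a query returns. The interface LimitDialect and the two implementing classes are hypothetical names invented for this illustration; they aren't part of any product's API.

```java
// Hypothetical dialect abstraction; interface and class names are illustrative only.
interface LimitDialect {
    // Returns the given SELECT statement rewritten to return at most maxRows rows.
    String limit(String select, int maxRows);
}

// Databases such as PostgreSQL append a LIMIT clause to the statement.
class PostgreSqlStyle implements LimitDialect {
    public String limit(String select, int maxRows) {
        return select + " limit " + maxRows;
    }
}

// Microsoft SQL Server places TOP directly after the SELECT keyword.
class SqlServerStyle implements LimitDialect {
    public String limit(String select, int maxRows) {
        return select.replaceFirst("(?i)^select", "select top " + maxRows);
    }
}

class DialectSketch {
    public static void main(String[] args) {
        String query = "select NAME from ITEM order by NAME";
        LimitDialect[] dialects = { new PostgreSqlStyle(), new SqlServerStyle() };
        // The calling code is identical for both databases; only the dialect object differs.
        for (LimitDialect dialect : dialects) {
            System.out.println(dialect.limit(query, 10));
        }
    }
}
```

The application asks for "the first ten rows" once, and the choice of dialect object decides which SQL is sent. Switching databases then becomes a configuration change rather than a search through hand-written SQL strings.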
You need to select an ORM product at some point. To make an educated decision, you need a list of the software modules and standards that are available.
1.4.4
Introducing Hibernate, EJB3, and JPA
Hibernate is a full object/relational mapping tool that provides all the previously listed ORM benefits. The API you’re working with in Hibernate is native and designed by the Hibernate developers. The same is true for the query interfaces and query languages, and for how object/relational mapping metadata is defined. Before you start your first project with Hibernate, you should consider the EJB 3.0 standard and its subspecification, Java Persistence. Let’s go back in history and see how this new standard came into existence. Many Java developers considered EJB 2.1 entity beans as one of the technologies for the implementation of a persistence layer. The whole EJB programming and persistence model has been widely adopted in the industry, and it has been an important factor in the success of J2EE (or, Java EE as it’s now called). However, over the last years, critics of EJB in the developer community became more vocal (especially with regard to entity beans and persistence), and companies realized that the EJB standard should be improved. Sun, as the steering party of J2EE, knew that an overhaul was in order and started a new Java specification request (JSR) with the goal of simplifying EJB in early 2003. This new JSR, Enterprise JavaBeans 3.0 (JSR 220), attracted significant interest. Developers from the Hibernate team joined the expert group early on and helped shape the new specification. Other vendors, including all major and many smaller companies in the Java industry, also contributed to the effort. An important decision made for the new standard was to specify and standardize things that work in practice, taking ideas and concepts from existing successful products and projects. Hibernate, therefore, being a successful data persistence solution, played an important role for the persistence part of the new standard. But what exactly is the relationship between Hibernate and EJB3, and what is Java Persistence? 
Understanding the standards
First, it’s difficult (if not impossible) to compare a specification and a product. The questions that should be asked are, “Does Hibernate implement the EJB 3.0 specification, and what is the impact on my project? Do I have to use one or the other?”
The new EJB 3.0 specification comes in several parts: The first part defines the new EJB programming model for session beans and message-driven beans, the deployment rules, and so on. The second part of the specification deals with persistence exclusively: entities, object/relational mapping metadata, persistence manager interfaces, and the query language. This second part is called Java Persistence API (JPA), probably because its interfaces are in the package javax.persistence. We’ll use this acronym throughout the book.
This separation also exists in EJB 3.0 products; some implement a full EJB 3.0 container that supports all parts of the specification, and other products may implement only the Java Persistence part. Two important principles were designed into the new standard:
■ JPA engines should be pluggable, which means you should be able to take out one product and replace it with another if you aren’t satisfied, even if you want to stay with the same EJB 3.0 container or Java EE 5.0 application server.
■ JPA engines should be able to run outside of an EJB 3.0 (or any other) runtime environment, without a container in plain standard Java.
The consequences of this design are that there are more options for developers and architects, which drives competition and therefore improves overall quality of products. Of course, actual products also offer features that go beyond the specification as vendor-specific extensions (such as for performance tuning, or because the vendor has a focus on a particular vertical problem space).
Hibernate implements Java Persistence, and because a JPA engine must be pluggable, new and interesting combinations of software are possible. You can select from various Hibernate software modules and combine them depending on your project’s technical and business requirements.
Hibernate Core
The Hibernate Core is also known as Hibernate 3.2.x, or Hibernate. It’s the base service for persistence, with its native API and its mapping metadata stored in XML files. It has a query language called HQL (almost the same as SQL), as well as programmatic query interfaces for Criteria and Example queries. There are hundreds of options and features available for everything, as Hibernate Core is really the foundation and the platform all other modules are built on.
You can use Hibernate Core on its own, independent from any framework or any particular runtime environment with all JDKs. It works in every Java EE/J2EE application server, in Swing applications, in a simple servlet container, and so on. As long as you can configure a data source for Hibernate, it works. Your application code (in your persistence layer) will use Hibernate APIs and queries, and your mapping metadata is written in native Hibernate XML files.
Native Hibernate APIs, queries, and XML mapping files are the primary focus of this book, and they’re explained first in all code examples. The reason for that is that Hibernate functionality is a superset of all other available options.
Hibernate Annotations
A new way to define application metadata became available with JDK 5.0: type-safe annotations embedded directly in the Java source code. Many Hibernate users are already familiar with this concept, as the XDoclet software supports Javadoc metadata attributes and a preprocessor at compile time (which, for Hibernate, generates XML mapping files).
With the Hibernate Annotations package on top of Hibernate Core, you can now use type-safe JDK 5.0 metadata as a replacement for, or in addition to, native Hibernate XML mapping files. You’ll find the syntax and semantics of the mapping annotations familiar once you’ve seen them side-by-side with Hibernate XML mapping files. However, the basic annotations aren’t proprietary.
The JPA specification defines object/relational mapping metadata syntax and semantics, with the primary mechanism being JDK 5.0 annotations. (Yes, JDK 5.0 is required for Java EE 5.0 and EJB 3.0.) Naturally, the Hibernate Annotations are a set of basic annotations that implement the JPA standard, and they’re also a set of extension annotations you need for more advanced and exotic Hibernate mappings and tuning.
You can use Hibernate Core and Hibernate Annotations to reduce your lines of code for mapping metadata, compared to the native XML files, and you may like the better refactoring capabilities of annotations. You can use only JPA annotations, or you can add a Hibernate extension annotation if complete portability isn’t your primary concern. (In practice, you should embrace the product you’ve chosen instead of denying its existence at all times.)
We’ll discuss the impact of annotations on your development process, and how to use them in mappings, throughout this book, along with native Hibernate XML mapping examples.
Hibernate EntityManager
The JPA specification also defines programming interfaces, lifecycle rules for persistent objects, and query features. The Hibernate implementation for this part of JPA is available as Hibernate EntityManager, another optional module you can stack on top of Hibernate Core. You can fall back to a plain Hibernate interface, or even a JDBC Connection, when one is needed. Hibernate’s native features are a superset of the JPA persistence features in every respect. (The simple fact is that Hibernate EntityManager is a small wrapper around Hibernate Core that provides JPA compatibility.)
Working with standardized interfaces and using a standardized query language has the benefit that you can execute your JPA-compatible persistence layer with any EJB 3.0 compliant application server. Or, you can use JPA outside of any particular standardized runtime environment in plain Java (which really means everywhere Hibernate Core can be used).
Hibernate Annotations should be considered in combination with Hibernate EntityManager. It’s unusual that you’d write your application code against JPA interfaces and with JPA queries, and not create most of your mappings with JPA annotations.
Java EE 5.0 application servers
We don’t cover all of EJB 3.0 in this book; our focus is naturally on persistence, and therefore on the JPA part of the specification. (We will, of course, show you many techniques with managed EJB components when we talk about application architecture and design.)
Hibernate is also part of the JBoss Application Server (JBoss AS), an implementation of J2EE 1.4 and (soon) Java EE 5.0. A combination of Hibernate Core, Hibernate Annotations, and Hibernate EntityManager forms the persistence engine of this application server. Hence, everything you can use stand-alone, you can also use inside the application server with all the EJB 3.0 benefits, such as session beans, message-driven beans, and other Java EE services.
To complete the picture, you also have to understand that Java EE 5.0 application servers are no longer the monolithic beasts of the J2EE 1.4 era. In fact, the JBoss EJB 3.0 container also comes in an embeddable version, which runs inside other application servers, and even in Tomcat, or in a unit test, or a Swing application. In the next chapter, you’ll prepare a project that utilizes EJB 3.0 components, and you’ll install the JBoss server for easy integration testing.
As you can see, native Hibernate features implement significant parts of the specification or are natural vendor extensions, offering additional functionality if required. Here is a simple trick to see immediately what code you’re looking at, whether JPA or native Hibernate. If only the javax.persistence.* import is visible, you’re working inside the specification; if you also import org.hibernate.*, you’re using native Hibernate functionality. We’ll later show you a few more tricks that will help you cleanly separate portable from vendor-specific code.
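The import rule of thumb just described is simple enough to automate. The following sketch applies it to the lines of a source file; the class ImportCheckSketch and its classify method are hypothetical helpers written for this illustration, and the check is deliberately naive (it only looks at the beginning of each line and ignores fully qualified names used inside the code).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that applies the import rule of thumb to a source file's lines.
class ImportCheckSketch {

    static String classify(List<String> sourceLines) {
        boolean usesJpa = false;
        boolean usesHibernate = false;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("import javax.persistence.")) {
                usesJpa = true;
            }
            if (trimmed.startsWith("import org.hibernate.")) {
                usesHibernate = true;
            }
        }
        // Any org.hibernate import means vendor-specific functionality is in play.
        if (usesHibernate) {
            return "native Hibernate";
        }
        return usesJpa ? "JPA only" : "no persistence imports";
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(
                "import javax.persistence.Entity;",
                "import javax.persistence.Id;")));
        System.out.println(classify(Arrays.asList(
                "import javax.persistence.Entity;",
                "import org.hibernate.annotations.BatchSize;")));
    }
}
```

A check like this could run as part of a build to flag classes that were meant to stay portable but have picked up a vendor extension.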
FAQ
What is the future of Hibernate? Hibernate Core will be developed independently from and faster than the EJB 3.0 or Java Persistence specifications. It will be the testing ground for new ideas, as it has always been. Any new feature developed for Hibernate Core is immediately and automatically available as an extension for all users of Java Persistence with Hibernate Annotations and Hibernate EntityManager. Over time, if a particular concept has proven its usefulness, Hibernate developers will work with other expert group members on future standardization in an updated EJB or Java Persistence specification. Hence, if you’re interested in a quickly evolving standard, we encourage you to use native Hibernate functionality, and to send feedback to the respective expert group. The desire for total portability and the rejection of vendor extensions were major reasons for the stagnation we saw in EJB 1.x and 2.x.
After so much praise of ORM and Hibernate, it’s time to look at some actual code. It’s time to wrap up the theory and to set up a first project.
1.5
Summary
In this chapter, we’ve discussed the concept of object persistence and the importance of ORM as an implementation technique. Object persistence means that individual objects can outlive the application process; they can be saved to a data store and be re-created at a later point in time. The object/relational mismatch comes into play when the data store is an SQL-based relational database management system. For instance, a network of objects can’t be saved to a database table; it must be disassembled and persisted to columns of portable SQL datatypes. A good solution for this problem is object/relational mapping (ORM), which is especially helpful if we consider richly typed Java domain models. A domain model represents the business entities used in a Java application. In a layered system architecture, the domain model is used to execute business logic in the business layer (in Java, not in the database). This business layer communicates with the persistence layer beneath in order to load and store the persistent objects of the domain model. ORM is the middleware in the persistence layer that manages the persistence. ORM isn’t a silver bullet for all persistence tasks; its job is to relieve the developer of 95 percent of object persistence work, such as writing complex SQL statements with many table joins, and copying values from JDBC result sets to objects or graphs of objects. A full-featured ORM middleware solution may provide database portability, certain optimization techniques like caching, and other viable functions that aren’t easy to hand-code in a limited time with SQL and JDBC.
It’s likely that a better solution than ORM will exist some day. We (and many others) may have to rethink everything we know about SQL, persistence API standards, and application integration. The evolution of today’s systems into true relational database systems with seamless object-oriented integration remains pure speculation. But we can’t wait, and there is no sign that any of these issues will improve soon (a multibillion dollar industry isn’t very agile). ORM is the best solution currently available, and it’s a timesaver for developers facing the object/relational mismatch every day. With EJB 3.0, a specification for full object/relational mapping software that is accepted in the Java industry is finally available.
CHAPTER 2
Starting a project
This chapter covers
■ “Hello World” with Hibernate and Java Persistence
■ The toolset for forward and reverse engineering
■ Hibernate configuration and integration
You want to start using Hibernate and Java Persistence, and you want to learn it with a step-by-step example. You want to see both persistence APIs and how you can benefit from native Hibernate or standardized JPA. This is what you’ll find in this chapter: a tour through a straightforward “Hello World” application. However, a good and complete tutorial is already publicly available in the Hibernate reference documentation, so instead of repeating it here, we show you more detailed instructions about Hibernate integration and configuration along the way. If you want to start with a less elaborate tutorial that you can complete in one hour, our advice is to consider the Hibernate reference documentation. It takes you from a simple stand-alone Java application with Hibernate through the most essential mapping concepts and finally demonstrates a Hibernate web application deployed on Tomcat. In this chapter, you’ll learn how to set up a project infrastructure for a plain Java application that integrates Hibernate, and you’ll see many more details about how Hibernate can be configured in such an environment. We also discuss configuration and integration of Hibernate in a managed environment—that is, an environment that provides Java EE services. As a build tool for the “Hello World” project, we introduce Ant and create build scripts that can not only compile and run the project, but also utilize the Hibernate Tools. Depending on your development process, you’ll use the Hibernate toolset to export database schemas automatically or even to reverse-engineer a complete application from an existing (legacy) database schema. Like every good engineer, before you start your first real Hibernate project you should prepare your tools and decide what your development process is going to look like. And, depending on the process you choose, you may naturally prefer different tools. Let’s look at this preparation phase and what your options are, and then start a Hibernate project.
2.1
Starting a Hibernate project
In some projects, the development of an application is driven by developers analyzing the business domain in object-oriented terms. In others, it’s heavily influenced by an existing relational data model: either a legacy database or a brand-new schema designed by a professional data modeler. There are many choices to be made, and the following questions need to be answered before you can start:
■ Can you start from scratch with a clean design of a new business requirement, or is legacy data and/or legacy application code present?
■ Can some of the necessary pieces be automatically generated from an existing artifact (for example, Java source from an existing database schema)? Can the database schema be generated from Java code and Hibernate mapping metadata?
■ What kind of tool is available to support this work? What about other tools to support the full development cycle?
We’ll discuss these questions in the following sections as we set up a basic Hibernate project. This is your road map:
1. Select a development process
2. Set up the project infrastructure
3. Write application code and mappings
4. Configure and start Hibernate
5. Run the application
After reading the next sections, you’ll be prepared for the correct approach in your own project, and you’ll also have the background information for more complex scenarios we’ll touch on later in this chapter.
2.1.1
Selecting a development process
Let’s first get an overview of the available tools, the artifacts they use as source input, and the output that is produced. Figure 2.1 shows various import and export tasks for Ant; all the functionality is also available with the Hibernate Tools plug-ins for Eclipse. Refer to this diagram while reading this chapter. [1]
[Figure 2.1 elements: UML Model XML/XMI; AndroMDA; Persistent Class Java Source; Mapping Metadata Annotations; Data Access Object Java Source; Documentation HTML; Hibernate Metamodel; Database Schema; Mapping Metadata XML; Configuration XML; Freemarker Template]
Figure 2.1 Input and output of the tools used for Hibernate development
NOTE
Hibernate Tools for Eclipse IDE —The Hibernate Tools are plug-ins for the Eclipse IDE (part of the JBoss IDE for Eclipse—a set of wizards, editors, and extra views in Eclipse that help you develop EJB3, Hibernate, JBoss Seam, and other Java applications based on JBoss middleware). The features for forward and reverse engineering are equivalent to the Ant-based tools. The additional Hibernate Console view allows you to execute ad hoc Hibernate queries (HQL and Criteria) against your database and to browse the result graphically. The Hibernate Tools XML editor supports automatic completion of mapping files, including class, property, and even table and column names. The graphical tools were still in development and available as a beta release during the writing of this book, however, so any screenshots would be obsolete with future releases of the software. The documentation of the Hibernate Tools contains many screenshots and detailed project setup instructions that you can easily adapt to create your first “Hello World” program with the Eclipse IDE.
The following development scenarios are common:
■ Top down: In top-down development, you start with an existing domain model, its implementation in Java, and (ideally) complete freedom with respect to the database schema. You must create mapping metadata (either with XML files or by annotating the Java source) and then optionally let Hibernate’s hbm2ddl tool generate the database schema. In the absence of an existing database schema, this is the most comfortable development style for most Java developers. You may even use the Hibernate Tools to automatically refresh the database schema on every application restart in development.
■ Bottom up: Conversely, bottom-up development begins with an existing database schema and data model. In this case, the easiest way to proceed is to use the reverse-engineering tools to extract metadata from the database. This metadata can be used to generate XML mapping files, with hbm2hbmxml for example. With hbm2java, the Hibernate mapping metadata is used to generate Java persistent classes, and even data access objects; in other words, a skeleton for a Java persistence layer. Or, instead of writing to XML mapping files, annotated Java source code (EJB 3.0 entity classes) can be produced directly by the tools. However, not all class association details and Java-specific metainformation can be automatically generated from an SQL database schema with this strategy, so expect some manual work.
■ Middle out: The Hibernate XML mapping metadata provides sufficient information to completely deduce the database schema and to generate the Java source code for the persistence layer of the application. Furthermore, the XML mapping document isn’t too verbose. Hence, some architects and developers prefer middle-out development, where they begin with handwritten Hibernate XML mapping files, and then generate the database schema using hbm2ddl and Java classes using hbm2java. The Hibernate XML mapping files are constantly updated during development, and other artifacts are generated from this master definition. Additional business logic or database objects are added through subclassing and auxiliary DDL. This development style can be recommended only for the seasoned Hibernate expert.
■ Meet in the middle: The most difficult scenario is combining existing Java classes and an existing database schema. In this case, there is little that the Hibernate toolset can do to help. It is, of course, not possible to map arbitrary Java domain models to a given schema, so this scenario usually requires at least some refactoring of the Java classes, database schema, or both. The mapping metadata will almost certainly need to be written by hand and in XML files (though it might be possible to use annotations if there is a close match). This can be an incredibly painful scenario, and it is, fortunately, exceedingly rare.
[1] Note that AndroMDA, a tool that generates POJO source code from UML diagram files, isn’t strictly considered part of the common Hibernate toolset, so it isn’t discussed in this chapter. See the community area on the Hibernate website for more information about the Hibernate module for AndroMDA.
We now explore the tools and their configuration options in more detail and set up a work environment for typical Hibernate application development. You can follow our instructions step by step and create the same environment, or you can take only the bits and pieces you need, such as the Ant build scripts. The development process we assume first is top down, and we’ll walk through a Hibernate project that doesn’t involve any legacy data schemas or Java code. After that, you’ll migrate the code to JPA and EJB 3.0, and then you’ll start a project bottom up by reverse-engineering from an existing database schema.
2.1.2 Setting up the project
We assume that you’ve downloaded the latest production release of Hibernate from the Hibernate website at http://www.hibernate.org/ and that you unpacked the archive. You also need Apache Ant installed on your development machine.
CHAPTER 2
Starting a project
You should also download a current version of HSQLDB from http://hsqldb.org/ and extract the package; you’ll use this database management system for your tests. If you have another database management system already installed, you only need to obtain a JDBC driver for it.
Instead of the sophisticated application you’ll develop later in the book, you’ll get started with a “Hello World” example. That way, you can focus on the development process without getting distracted by Hibernate details. Let’s set up the project directory first.
Creating the work directory
Create a new directory on your system, in any location you like; C:\helloworld is a good choice if you work on Microsoft Windows. We’ll refer to this directory as WORKDIR in future examples. Create lib and src subdirectories, and copy all required libraries:
WORKDIR
+lib
  antlr.jar
  asm.jar
  asm-attrs.jar
  c3p0.jar
  cglib.jar
  commons-collections.jar
  commons-logging.jar
  dom4j.jar
  hibernate3.jar
  hsqldb.jar
  jta.jar
+src
The libraries you see in the library directory are from the Hibernate distribution, most of them required for a typical Hibernate project. The hsqldb.jar file is from the HSQLDB distribution; replace it with a different driver JAR if you want to use a different database management system. Keep in mind that some of the libraries you’re seeing here may not be required for the particular version of Hibernate you’re working with, which is likely a newer release than we used when writing this book. To make sure you have the right set of libraries, always check the lib/README.txt file in the Hibernate distribution package. This file contains an up-to-date list of all required and optional third-party libraries for Hibernate; you only need the libraries listed as required for runtime.
In the “Hello World” application, you want to store messages in the database and load them from the database. You need to create the domain model for this business case.
Creating the domain model
Hibernate applications define persistent classes that are mapped to database tables. You define these classes based on your analysis of the business domain; hence, they’re a model of the domain. The “Hello World” example consists of one class and its mapping. Let’s see what a simple persistent class looks like, how the mapping is created, and some of the things you can do with instances of the persistent class in Hibernate.
The objective of this example is to store messages in a database and retrieve them for display. Your application has a simple persistent class, Message, which represents these printable messages. The Message class is shown in listing 2.1.
Listing 2.1 Message.java: a simple persistent class
package hello;

public class Message {
    private Long id;              // Identifier attribute
    private String text;          // Message text
    private Message nextMessage;  // Reference to another Message instance

    Message() {}

    public Message(String text) {
        this.text = text;
    }

    public Long getId() {
        return id;
    }
    private void setId(Long id) {
        this.id = id;
    }

    public String getText() {
        return text;
    }
    public void setText(String text) {
        this.text = text;
    }

    public Message getNextMessage() {
        return nextMessage;
    }
    public void setNextMessage(Message nextMessage) {
        this.nextMessage = nextMessage;
    }
}
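Listing 2.1 declares a constructor with no parameters and package visibility. As a rough illustration of why persistence tools want such a constructor, the following sketch creates an object reflectively and fills a private field using only java.lang.reflect. The Note class, the ReflectiveInstantiationDemo class, and the instantiate helper are hypothetical stand-ins; this is not a description of how Hibernate itself instantiates objects.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Hypothetical stand-in for a persistent class such as Message.
class Note {
    private Long id;
    private String text;

    Note() {}  // no-argument constructor, package visibility

    public Long getId() { return id; }
    public String getText() { return text; }
}

final class ReflectiveInstantiationDemo {

    // Creates an instance through the no-argument constructor and
    // sets one field by name, even if the field is private.
    static <T> T instantiate(Class<T> type, String fieldName, Object value) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);   // the constructor need not be public
            T instance = ctor.newInstance();
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);  // bypass the private modifier
            field.set(instance, value);
            return instance;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException("Cannot instantiate " + type, ex);
        }
    }

    public static void main(String[] args) {
        Note note = instantiate(Note.class, "text", "Hello World");
        System.out.println(note.getText());
    }
}
```

Without a no-argument constructor, getDeclaredConstructor() in this sketch would fail, which is the practical reason the requirement exists for reflection-based tools.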
The Message class has three attributes: the identifier attribute, the text of the message, and a reference to another Message object. The identifier attribute allows the application to access the database identity (the primary key value) of a persistent object. If two instances of Message have the same identifier value, they represent the same row in the database.
This example uses Long for the type of the identifier attribute, but this isn’t a requirement. Hibernate allows virtually anything for the identifier type, as you’ll see later.
You may have noticed that all attributes of the Message class have JavaBeans-style property accessor methods. The class also has a constructor with no parameters. The persistent classes we show in the examples will almost always look something like this. The no-argument constructor is a requirement (tools like Hibernate use reflection on this constructor to instantiate objects).
Instances of the Message class can be managed (made persistent) by Hibernate, but they don’t have to be. Because the Message object doesn’t implement any Hibernate-specific classes or interfaces, you can use it just like any other Java class:
Message message = new Message("Hello World");
System.out.println( message.getText() );
This code fragment does exactly what you’ve come to expect from “Hello World” applications: It prints Hello World to the console. It may look like we’re trying to be cute here; in fact, we’re demonstrating an important feature that distinguishes Hibernate from some other persistence solutions. The persistent class can be used in any execution context at all; no special container is needed. Note that this is also one of the benefits of the new JPA entities, which are also plain Java objects.
Save the code for the Message class into your source folder, in a directory and package named hello.
Mapping the class to a database schema
To allow the object/relational mapping magic to occur, Hibernate needs some more information about exactly how the Message class should be made persistent. In other words, Hibernate needs to know how instances of that class are supposed to be stored and loaded. This metadata can be written into an XML mapping document, which defines, among other things, how properties of the Message class map to columns of a MESSAGES table. Let’s look at the mapping document in listing 2.2.
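Listing 2.2 A simple Hibernate XML mapping
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
    <class
        name="hello.Message"
        table="MESSAGES">
        <id
            name="id"
            column="MESSAGE_ID">
            <generator class="increment"/>
        </id>
        <property
            name="text"
            column="MESSAGE_TEXT"/>
        <many-to-one
            name="nextMessage"
            cascade="all"
            column="NEXT_MESSAGE_ID"
            foreign-key="FK_NEXT_MESSAGE"/>
    </class>
</hibernate-mapping>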
The mapping document tells Hibernate that the Message class is to be persisted to the MESSAGES table, that the identifier property maps to a column named MESSAGE_ID, that the text property maps to a column named MESSAGE_TEXT, and that the property named nextMessage is an association with many-to-one multiplicity that maps to a foreign key column named NEXT_MESSAGE_ID. Hibernate also generates the database schema for you and adds a foreign key constraint with the name FK_NEXT_MESSAGE to the database catalog. (Don’t worry about the other details for now.) The XML document isn’t difficult to understand. You can easily write and maintain it by hand. Later, we discuss a way of using annotations directly in the source code to define mapping information; but whichever method you choose,
Hibernate has enough information to generate all the SQL statements needed to insert, update, delete, and retrieve instances of the Message class. You no longer need to write these SQL statements by hand.
Create a file named Message.hbm.xml with the content shown in listing 2.2, and place it next to your Message.java file in the source package hello. The hbm suffix is a naming convention accepted by the Hibernate community, and most developers prefer to place mapping files next to the source code of their domain classes. Let’s load and store some objects in the main code of the “Hello World” application.
Storing and loading objects
What you really came here to see is Hibernate, so let’s save a new Message to the database (see listing 2.3).
Listing 2.3 The “Hello World” main application code
package hello;

import java.util.*;
import org.hibernate.*;
import persistence.*;

public class HelloWorld {

    public static void main(String[] args) {

        // First unit of work
        Session session =
            HibernateUtil.getSessionFactory().openSession();
        Transaction tx = session.beginTransaction();

        Message message = new Message("Hello World");
        Long msgId = (Long) session.save(message);

        tx.commit();
        session.close();

        // Second unit of work
        Session newSession =
            HibernateUtil.getSessionFactory().openSession();
        Transaction newTransaction = newSession.beginTransaction();

        List messages =
            newSession.createQuery("from Message m order by m.text asc").list();

        System.out.println( messages.size() + " message(s) found:" );

        for ( Iterator iter = messages.iterator(); iter.hasNext(); ) {
            Message loadedMsg = (Message) iter.next();
            System.out.println( loadedMsg.getText() );
        }

        newTransaction.commit();
        newSession.close();

        // Shutting down the application
        HibernateUtil.shutdown();
    }
}
Place this code in the file HelloWorld.java in the source folder of your project, in the hello package. Let’s walk through the code. The class has a standard Java main() method, and you can call it from the command line directly. Inside the main application code, you execute two separate units of work with Hibernate. The first unit stores a new Message object, and the second unit loads all objects and prints their text to the console. You call the Hibernate Session, Transaction, and Query interfaces to access the database:
■ Session: A Hibernate Session is many things in one. It’s a single-threaded nonshared object that represents a particular unit of work with the database. It has the persistence manager API you call to load and store objects. (The Session internals consist of a queue of SQL statements that need to be synchronized with the database at some point and a map of managed persistence instances that are monitored by the Session.)
■ Transaction: This Hibernate API can be used to set transaction boundaries programmatically, but it’s optional (transaction boundaries aren’t). Other choices are JDBC transaction demarcation, the JTA interface, or container-managed transactions with EJBs.
■ Query: A database query can be written in Hibernate’s own object-oriented query language (HQL) or plain SQL. This interface allows you to create queries, bind arguments to placeholders in the query, and execute the query in various ways.
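The description of the Session internals above (a queue of pending SQL statements plus a map of managed instances) can be pictured with a toy unit of work in plain Java. The class and method names here (ToyUnitOfWork, register, lookup, flush) are made up for illustration, and the queued strings are placeholders; nothing in this sketch is taken from Hibernate’s implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: a map of managed instances and a queue of pending statements.
final class ToyUnitOfWork {
    private final Map<Long, Object> managed = new LinkedHashMap<>();
    private final Queue<String> pendingSql = new ArrayDeque<>();

    // Registers an instance under its identifier and queues
    // (but does not execute) a placeholder insert statement.
    void register(Long id, Object instance) {
        managed.put(id, instance);
        pendingSql.add("insert /* id=" + id + " */");
    }

    // Returns the managed instance for an identifier, or null if unknown.
    Object lookup(Long id) {
        return managed.get(id);
    }

    int pendingCount() {
        return pendingSql.size();
    }

    // Drains the queue in order and returns the drained statements;
    // a real persistence manager would send them to the database here.
    List<String> flush() {
        List<String> executed = new ArrayList<>();
        while (!pendingSql.isEmpty()) {
            executed.add(pendingSql.remove());
        }
        return executed;
    }

    public static void main(String[] args) {
        ToyUnitOfWork unitOfWork = new ToyUnitOfWork();
        unitOfWork.register(1L, "Hello World");
        System.out.println("pending before flush: " + unitOfWork.pendingCount());
        System.out.println("flushed: " + unitOfWork.flush());
        System.out.println("pending after flush: " + unitOfWork.pendingCount());
    }
}
```

The point of the sketch is the separation between registering work and executing it: nothing is "sent" until flush() is called.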
Ignore the line of code that calls HibernateUtil.getSessionFactory()—we’ll get to it soon.
The first unit of work, if run, results in the execution of something similar to the following SQL:
insert into MESSAGES (MESSAGE_ID, MESSAGE_TEXT, NEXT_MESSAGE_ID) values (1, 'Hello World', null)
Hold on—the MESSAGE_ID column is being initialized to a strange value. You didn’t set the id property of message anywhere, so you expect it to be NULL, right? Actually, the id property is special. It’s an identifier property: It holds a generated unique value. The value is assigned to the Message instance by Hibernate when save() is called. (We’ll discuss how the value is generated later.) Look at the second unit of work. The literal string "from Message m order by m.text asc" is a Hibernate query, expressed in HQL. This query is internally translated into the following SQL when list() is called:
select m.MESSAGE_ID, m.MESSAGE_TEXT, m.NEXT_MESSAGE_ID from MESSAGES m order by m.MESSAGE_TEXT asc
If you run this main() method (don’t try this now—you still need to configure Hibernate), the output on your console is as follows:
1 message(s) found:
Hello World
If you’ve never used an ORM tool like Hibernate before, you probably expected to see the SQL statements somewhere in the code or mapping metadata, but they aren’t there. All SQL is generated at runtime (actually, at startup for all reusable SQL statements). Your next step would normally be configuring Hibernate. However, if you feel confident, you can add two other Hibernate features—automatic dirty checking and cascading—in a third unit of work by adding the following code to your main application:
// Third unit of work
Session thirdSession =
    HibernateUtil.getSessionFactory().openSession();
Transaction thirdTransaction = thirdSession.beginTransaction();

// msgId holds the identifier value of the first message
message = (Message) thirdSession.get( Message.class, msgId );

message.setText( "Greetings Earthling" );
message.setNextMessage(
    new Message( "Take me to your leader (please)" )
);

thirdTransaction.commit();
thirdSession.close();
This code results in three SQL statements being executed inside the same database transaction:
select m.MESSAGE_ID, m.MESSAGE_TEXT, m.NEXT_MESSAGE_ID
from MESSAGES m
where m.MESSAGE_ID = 1

insert into MESSAGES (MESSAGE_ID, MESSAGE_TEXT, NEXT_MESSAGE_ID)
values (2, 'Take me to your leader (please)', null)

update MESSAGES
set MESSAGE_TEXT = 'Greetings Earthling', NEXT_MESSAGE_ID = 2
where MESSAGE_ID = 1
Notice how Hibernate detected the modification to the text and nextMessage properties of the first message and automatically updated the database—Hibernate did automatic dirty checking. This feature saves you the effort of explicitly asking Hibernate to update the database when you modify the state of an object inside a unit of work. Similarly, the new message was made persistent when a reference was created from the first message. This feature is called cascading save. It saves you the effort of explicitly making the new object persistent by calling save(), as long as it’s reachable by an already persistent instance. Also notice that the ordering of the SQL statements isn’t the same as the order in which you set property values. Hibernate uses a sophisticated algorithm to determine an efficient ordering that avoids database foreign key constraint violations but is still sufficiently predictable to the user. This feature is called transactional write-behind. If you ran the application now, you’d get the following output (you’d have to copy the second unit of work after the third to execute the query-display step again):
2 message(s) found:
Greetings Earthling
Take me to your leader (please)
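One common way to picture automatic dirty checking is a snapshot comparison: remember the property values when an object is loaded, then compare them with the current values at commit time. The sketch below does this for a map of property values. It is an illustration under that assumption only; the DirtyCheckDemo class and the dirtyProperties helper are hypothetical, and the text above does not say how Hibernate implements the feature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

final class DirtyCheckDemo {

    // Returns the properties whose current value differs from the
    // value recorded in the snapshot, keyed by property name.
    static Map<String, Object> dirtyProperties(Map<String, Object> snapshot,
                                               Map<String, Object> current) {
        Map<String, Object> dirty = new TreeMap<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            Object loadedValue = snapshot.get(entry.getKey());
            if (!Objects.equals(loadedValue, entry.getValue())) {
                dirty.put(entry.getKey(), entry.getValue());
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        // State as it was when the object was loaded
        Map<String, Object> snapshot = new HashMap<>();
        snapshot.put("text", "Hello World");
        snapshot.put("nextMessageId", null);

        // State after the application modified the object
        Map<String, Object> current = new HashMap<>(snapshot);
        current.put("text", "Greetings Earthling");
        current.put("nextMessageId", 2L);

        // Only the changed properties would need to appear in an UPDATE
        System.out.println(dirtyProperties(snapshot, current));
    }
}
```

If nothing changed, the returned map is empty and no update would be needed, which mirrors the idea that you never ask for the update explicitly.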
You now have domain classes, an XML mapping file, and the “Hello World” application code that loads and stores objects. Before you can compile and run this code, you need to create Hibernate’s configuration (and resolve the mystery of the HibernateUtil class).
2.1.3 Hibernate configuration and startup
The regular way of initializing Hibernate is to build a SessionFactory object from a Configuration object. If you like, you can think of the Configuration as an object representation of a configuration file (or a properties file) for Hibernate. Let’s look at some variations before we wrap it up in the HibernateUtil class.
Building a SessionFactory
This is an example of a typical Hibernate startup procedure, in one line of code, using automatic configuration file detection:
SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
Wait—how did Hibernate know where the configuration file was located and which one to load? When new Configuration() is called, Hibernate searches for a file named hibernate.properties in the root of the classpath. If it’s found, all hibernate.* properties are loaded and added to the Configuration object. When configure() is called, Hibernate searches for a file named hibernate.cfg.xml in the root of the classpath, and an exception is thrown if it can’t be found. You don’t have to call this method if you don’t have this configuration file, of course. If settings in the XML configuration file are duplicates of properties set earlier, the XML settings override the previous ones. The location of the hibernate.properties configuration file is always the root of the classpath, outside of any package. If you wish to use a different file or to have Hibernate look in a subdirectory of your classpath for the XML configuration file, you must pass a path as an argument of the configure() method:
SessionFactory sessionFactory = new Configuration()
    .configure("/persistence/auction.cfg.xml")
    .buildSessionFactory();
Finally, you can always set additional configuration options or mapping file locations on the Configuration object programmatically, before building the SessionFactory:
SessionFactory sessionFactory = new Configuration()
    .configure("/persistence/auction.cfg.xml")
    .setProperty(Environment.DEFAULT_SCHEMA, "CAVEATEMPTOR")
    .addResource("auction/CreditCard.hbm.xml")
    .buildSessionFactory();
Many sources for the configuration are applied here: First, the hibernate.properties file in your classpath is read (if present). Next, all settings from /persistence/auction.cfg.xml are added and override any previously applied settings. Finally, an additional configuration property (a default database schema name) is set programmatically, and an additional Hibernate XML mapping metadata file is added to the configuration.
You can, of course, set all options programmatically, or switch between different XML configuration files for different deployment databases. There is effectively no limitation on how you can configure and deploy Hibernate; in the end, you only need to build a SessionFactory from a prepared configuration.
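The override order described above (properties file first, then the XML configuration file, then programmatic settings) behaves like successive writes into one key/value map, where the last writer wins. The following sketch mimics that with java.util.Properties. The layer contents are invented sample values, and the merge helper is hypothetical rather than part of the Configuration API.

```java
import java.util.Properties;

final class ConfigLayeringDemo {

    // Applies each layer in order; a later layer overwrites
    // any key that an earlier layer already set.
    static Properties merge(Properties... layers) {
        Properties result = new Properties();
        for (Properties layer : layers) {
            result.putAll(layer);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stands in for settings read from hibernate.properties
        Properties fromPropertiesFile = new Properties();
        fromPropertiesFile.setProperty("hibernate.show_sql", "false");
        fromPropertiesFile.setProperty("hibernate.connection.username", "sa");

        // Stands in for settings read from an XML configuration file
        Properties fromXmlFile = new Properties();
        fromXmlFile.setProperty("hibernate.show_sql", "true");

        // Stands in for a setProperty() call made in code
        Properties programmatic = new Properties();
        programmatic.setProperty("hibernate.default_schema", "CAVEATEMPTOR");

        Properties effective = merge(fromPropertiesFile, fromXmlFile, programmatic);
        System.out.println(effective.getProperty("hibernate.show_sql"));
        System.out.println(effective.getProperty("hibernate.connection.username"));
        System.out.println(effective.getProperty("hibernate.default_schema"));
    }
}
```

In this sketch the show_sql value from the second layer replaces the one from the first, while the username from the first layer survives because no later layer touches it.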
NOTE
Method chaining—Method chaining is a programming style supported by many Hibernate interfaces. This style is more popular in Smalltalk than in Java and is considered by some people to be less readable and more difficult to debug than the more accepted Java style. However, it’s convenient in many cases, such as for the configuration snippets you’ve seen in this section. Here is how it works: Most Java developers declare setter or adder methods to be of type void, meaning they return no value; but in Smalltalk, which has no void type, setter or adder methods usually return the receiving object. We use this Smalltalk style in some code examples, but if you don’t like it, you don’t need to use it. If you do use this coding style, it’s better to write each method invocation on a different line. Otherwise, it may be difficult to step through the code in your debugger.
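The receiver-returning style described in this note is easy to try in isolation. In the sketch below, each setter or adder returns this instead of void, so calls can be chained, one invocation per line. ChainedSettings is a hypothetical class written for this illustration; it is not a Hibernate type.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical settings holder used only to illustrate method chaining.
final class ChainedSettings {
    private final Map<String, String> values = new LinkedHashMap<>();
    private final List<String> resources = new ArrayList<>();

    // Returns the receiving object instead of void, so calls can be chained.
    ChainedSettings set(String name, String value) {
        values.put(name, value);
        return this;
    }

    // Adder method in the same style.
    ChainedSettings addResource(String path) {
        resources.add(path);
        return this;
    }

    String get(String name) {
        return values.get(name);
    }

    int resourceCount() {
        return resources.size();
    }

    public static void main(String[] args) {
        // One invocation per line keeps the chain easy to step through.
        ChainedSettings settings = new ChainedSettings()
            .set("show_sql", "true")
            .set("format_sql", "true")
            .addResource("hello/Message.hbm.xml");

        System.out.println(settings.get("show_sql"));
        System.out.println(settings.resourceCount());
    }
}
```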
Now that you know how Hibernate is started and how to build a SessionFactory, what to do next? You have to create a configuration file for Hibernate.
Creating an XML configuration file
Let’s assume you want to keep things simple, and, like most users, you decide to use a single XML configuration file for Hibernate that contains all the configuration details.
We recommend that you give your new configuration file the default name hibernate.cfg.xml and place it directly in the source directory of your project, outside of any package. That way, it will end up in the root of your classpath after compilation, and Hibernate will find it automatically. Look at the file in listing 2.4.
Listing 2.4 A simple Hibernate XML configuration file
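<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
  <session-factory>
    <property name="hibernate.connection.driver_class">
      org.hsqldb.jdbcDriver
    </property>
    <property name="hibernate.connection.url">
      jdbc:hsqldb:hsql://localhost
    </property>
    <property name="hibernate.connection.username">
      sa
    </property>
    <property name="hibernate.dialect">
      org.hibernate.dialect.HSQLDialect
    </property>
    <property name="hibernate.c3p0.min_size">5</property>
    <property name="hibernate.c3p0.max_size">20</property>
    <property name="hibernate.c3p0.timeout">300</property>
    <property name="hibernate.c3p0.max_statements">50</property>
    <property name="hibernate.c3p0.idle_test_period">3000</property>
    <property name="show_sql">true</property>
    <property name="format_sql">true</property>
    <mapping resource="hello/Message.hbm.xml"/>
  </session-factory>
</hibernate-configuration>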
The document type declaration is used by the XML parser to validate this document against the Hibernate configuration DTD. Note that this isn’t the same DTD as the one for Hibernate XML mapping files. Also note that we added some line breaks in the property values to make this more readable—you shouldn’t do this in your real configuration file (unless your database username contains a line break). First in the configuration file are the database connection settings. You need to tell Hibernate which database JDBC driver you’re using and how to connect to the database with a URL, a username, and a password (the password here is omitted, because HSQLDB by default doesn’t require one). You set a Dialect, so that Hibernate knows which SQL variation it has to generate to talk to your database; dozens of dialects are packaged with Hibernate—look at the Hibernate API documentation to get a list. In the XML configuration file, Hibernate properties may be specified without the hibernate prefix, so you can write either hibernate.show_sql or just show_sql. Property names and values are otherwise identical to programmatic configuration properties—that is, to the constants as defined in org.hibernate.cfg.Environment. The hibernate.connection.driver_class property, for example, has the constant Environment.DRIVER. Before we look at some important configuration options, consider the last line in the configuration that names a Hibernate XML mapping file. The Configuration object needs to know about all your XML mapping files before you build the SessionFactory. A SessionFactory is an object that represents a particular
Hibernate configuration for a particular set of mapping metadata. You can either list all your XML mapping files in the Hibernate XML configuration file, or you can set their names and paths programmatically on the Configuration object. In any case, if you list them as a resource, the path to the mapping files is the relative location on the classpath, with, in this example, hello being a package in the root of the classpath. You also enabled printing of all SQL executed by Hibernate to the console, and you told Hibernate to format it nicely so that you can check what is going on behind the scenes. We’ll come back to logging later in this chapter. Another, sometimes useful, trick is to make configuration options more dynamic with system properties:
...
<property name="show_sql">${displaysql}</property>
...
You can now specify a system property, such as with java -Ddisplaysql=true, on the command line when you start your application, and this will automatically be applied to the Hibernate configuration property.
The database connection pool settings deserve extra attention.
The database connection pool
Generally, it isn’t advisable to create a connection each time you want to interact with the database. Instead, Java applications should use a pool of connections. Each application thread that needs to do work on the database requests a connection from the pool and then returns it to the pool when all SQL operations have been executed. The pool maintains the connections and minimizes the cost of opening and closing connections. There are three reasons for using a pool:
■ Acquiring a new connection is expensive. Some database management systems even start a completely new server process for each connection.
■ Maintaining many idle connections is expensive for a database management system, and the pool can optimize the usage of idle connections (or disconnect if there are no requests).
■ Creating prepared statements is also expensive for some drivers, and the connection pool can cache statements for a connection across requests.
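The acquire-and-return cycle described above can be sketched with a small fixed-size pool built on a blocking queue. This is a deliberately minimal illustration: SimplePool is a hypothetical class, the type parameter T stands in for a JDBC connection, and features a production pool such as C3P0 provides (timeouts, validation, statement caching) are left out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Minimal fixed-size pool; T stands in for a JDBC connection.
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    // Creates all pooled resources up front, paying the creation cost once.
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Hands out an idle resource, or returns null if the pool is exhausted.
    T acquire() {
        return idle.poll();
    }

    // Puts the resource back for reuse instead of destroying it.
    void release(T resource) {
        idle.offer(resource);
    }

    int idleCount() {
        return idle.size();
    }
}

final class SimplePoolDemo {
    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);

        StringBuilder first = pool.acquire();
        System.out.println("idle after acquire: " + pool.idleCount());

        pool.release(first);
        System.out.println("idle after release: " + pool.idleCount());
    }
}
```

Returning null on exhaustion is a simplification chosen for brevity; a real pool would typically block for a while or throw an exception instead.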
Figure 2.2 shows the role of a connection pool in an unmanaged application runtime environment (that is, one without any application server).
Figure 2.2 JDBC connection pooling in a nonmanaged environment (diagram labels: Nonmanaged JSE environment, main())
With no application server to provide a connection pool, an application either implements its own pooling algorithm or relies on a third-party library such as the open source C3P0 connection pooling software. Without Hibernate, the application code calls the connection pool to obtain a JDBC connection and then executes SQL statements with the JDBC programming interface. When the application closes the SQL statements and finally closes the connection, the prepared statements and connection aren’t destroyed, but are returned to the pool. With Hibernate, the picture changes: It acts as a client of the JDBC connection pool, as shown in figure 2.3. The application code uses the Hibernate Session and Query API for persistence operations, and it manages database transactions (probably) with the Hibernate Transaction API. Hibernate defines a plug-in architecture that allows integration with any connection-pooling software. However, support for C3P0 is built in, and the software comes bundled with Hibernate, so you’ll use that (you already copied the c3p0.jar file into your library directory, right?). Hibernate maintains the pool for you, and configuration properties are passed through. How do you configure C3P0 through Hibernate?
Figure 2.3 Hibernate with a connection pool in a nonmanaged environment (diagram labels: Nonmanaged JSE environment, main())
One way to configure the connection pool is to put the settings into your hibernate.cfg.xml configuration file, like you did in the previous section. Alternatively, you can create a hibernate.properties file in the classpath root of the application. An example of a hibernate.properties file for C3P0 is shown in listing 2.5. Note that this file, with the exception of a list of mapping resources, is equivalent to the configuration shown in listing 2.4.
Listing 2.5 Using hibernate.properties for C3P0 connection pool settings
hibernate.connection.driver_class = org.hsqldb.jdbcDriver
hibernate.connection.url = jdbc:hsqldb:hsql://localhost
hibernate.connection.username = sa
hibernate.dialect = org.hibernate.dialect.HSQLDialect

# (B)
hibernate.c3p0.min_size = 5
# (C)
hibernate.c3p0.max_size = 20
# (D)
hibernate.c3p0.timeout = 300
# (E)
hibernate.c3p0.max_statements = 50
# (F)
hibernate.c3p0.idle_test_period = 3000

hibernate.show_sql = true
hibernate.format_sql = true
B hibernate.c3p0.min_size: This is the minimum number of JDBC connections that C3P0 keeps ready at all times.
C hibernate.c3p0.max_size: This is the maximum number of connections in the pool. An exception is thrown at runtime if this number is exhausted.
D hibernate.c3p0.timeout: You specify the timeout period (in this case, 300 seconds) after which an idle connection is removed from the pool.
E hibernate.c3p0.max_statements: A maximum of 50 prepared statements will be cached. Caching of prepared statements is essential for best performance with Hibernate.
F hibernate.c3p0.idle_test_period: This is the idle time in seconds before a connection is automatically validated.
Specifying properties of the form hibernate.c3p0.* selects C3P0 as the connection pool (the c3p0.max_size option is needed; you don’t need any other switch to enable C3P0 support). C3P0 has more features than shown in the previous example; refer to the properties file in the etc/ subdirectory of the Hibernate distribution to get a comprehensive example you can copy from.
The Javadoc for the class org.hibernate.cfg.Environment also documents every Hibernate configuration property. Furthermore, you can find an up-to-date table with all Hibernate configuration options in the Hibernate reference documentation. We’ll explain the most important settings throughout the book, however. You already know all you need to get started.
FAQ
Can I supply my own connections? Implement the org.hibernate.connection.ConnectionProvider interface, and name your implementation with the hibernate.connection.provider_class configuration option. Hibernate will now rely on your custom provider if it needs a database connection.
Now that you’ve completed the Hibernate configuration file, you can move on and create the SessionFactory in your application.
Handling the SessionFactory
In most Hibernate applications, the SessionFactory should be instantiated once during application initialization. The single instance should then be used by all code in a particular process, and any Session should be created using this single SessionFactory. The SessionFactory is thread-safe and can be shared; a Session is a single-threaded object.
A frequently asked question is where the factory should be stored after creation and how it can be accessed without much hassle. There are more advanced but comfortable options such as JNDI and JMX, but they’re usually available only in full Java EE application servers. Instead, we’ll introduce a pragmatic and quick solution that solves both the problem of Hibernate startup (the one line of code) and the storing and accessing of the SessionFactory: you’ll use a static global variable and static initialization.
Both the variable and initialization can be implemented in a single class, which you’ll call HibernateUtil. This helper class is well known in the Hibernate community; it’s a common pattern for Hibernate startup in plain Java applications without Java EE services. A basic implementation is shown in listing 2.6.
Listing 2.6 The HibernateUtil class for startup and SessionFactory handling
package persistence;

import org.hibernate.*;
import org.hibernate.cfg.*;

public class HibernateUtil {

    private static SessionFactory sessionFactory;

    static {
        try {
            sessionFactory = new Configuration()
                .configure()
                .buildSessionFactory();
        } catch (Throwable ex) {
            throw new ExceptionInInitializerError(ex);
        }
    }

    public static SessionFactory getSessionFactory() {
        // Alternatively, you could look up in JNDI here
        return sessionFactory;
    }

    public static void shutdown() {
        // Close caches and connection pools
        getSessionFactory().close();
    }
}
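The static-initializer pattern in listing 2.6 does not depend on anything specific to Hibernate, so it can be tried with a stand-in object. In the sketch below, ExpensiveResource plays the role of the SessionFactory and counts how often it is constructed; both class names and the get() accessor are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive, shareable object
// such as a SessionFactory.
final class ExpensiveResource {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveResource() {
        CREATED.incrementAndGet();  // record every construction
    }
}

final class ResourceHolder {
    private static ExpensiveResource resource;

    // Runs once, when the class is initialized on first use.
    static {
        try {
            resource = new ExpensiveResource();
        } catch (Throwable ex) {
            // Surface startup failures as an initialization error
            throw new ExceptionInInitializerError(ex);
        }
    }

    // Every caller receives the same shared instance.
    static ExpensiveResource get() {
        return resource;
    }

    public static void main(String[] args) {
        ExpensiveResource first = ResourceHolder.get();
        ExpensiveResource second = ResourceHolder.get();
        System.out.println("same instance: " + (first == second));
        System.out.println("constructions: " + ExpensiveResource.CREATED.get());
    }
}
```

Because class initialization is performed at most once per class loader, repeated calls to get() never trigger a second construction in this sketch.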
You create a static initializer block to start up Hibernate; this block is executed by the loader of this class exactly once, on initialization when the class is loaded. The first call of HibernateUtil in the application loads the class, builds the SessionFactory, and sets the static variable at the same time. If a problem occurs, any Exception or Error is wrapped and thrown out of the static block (that’s why you catch Throwable). The wrapping in ExceptionInInitializerError is mandatory for static initializers.
You’ve created this new class in a new package called persistence. In a fully featured Hibernate application, you often need such a package, for example, to wrap up your custom persistence layer interceptors and data type converters as part of your infrastructure.
Now, whenever you need access to a Hibernate Session in your application, you can get it easily with HibernateUtil.getSessionFactory().openSession(), just as you did earlier in the HelloWorld main application code.
You’re almost ready to run and test the application. But because you certainly want to know what is going on behind the scenes, you’ll first enable logging.
Enabling logging and statistics
You’ve already seen the hibernate.show_sql configuration property. You’ll need it continually when you develop software with Hibernate; it enables logging of all generated SQL to the console. You’ll use it for troubleshooting, for performance tuning, and to see what’s going on. If you also enable hibernate.format_sql, the output is more readable but takes up more screen space. A third option you haven’t set so far is hibernate.use_sql_comments; it causes Hibernate to put
comments inside all generated SQL statements to hint at their origin. For example, you can then easily see if a particular SQL statement was generated from an explicit query or an on-demand collection initialization.
Enabling the SQL output to stdout is only your first logging option. Hibernate (like many other ORM implementations) executes SQL statements asynchronously. An INSERT statement isn’t usually executed when the application calls session.save(), nor is an UPDATE immediately issued when the application calls item.setPrice(). Instead, the SQL statements are usually issued at the end of a transaction.
This means that tracing and debugging ORM code is sometimes nontrivial. In theory, it’s possible for the application to treat Hibernate as a black box and ignore this behavior. However, when you’re troubleshooting a difficult problem, you need to be able to see exactly what is going on inside Hibernate. Because Hibernate is open source, you can easily step into the Hibernate code, and occasionally this helps a great deal! Seasoned Hibernate experts debug problems by looking at the Hibernate log and the mapping files only; we encourage you to spend some time with the log output generated by Hibernate and familiarize yourself with the internals.
Hibernate logs all interesting events through Apache commons-logging, a thin abstraction layer that directs output to either Apache Log4j (if you put log4j.jar in your classpath) or JDK 1.4 logging (if you’re running under JDK 1.4 or above and Log4j isn’t present). We recommend Log4j because it’s more mature, more popular, and under more active development.
To see output from Log4j, you need a file named log4j.properties in your classpath (right next to hibernate.properties or hibernate.cfg.xml). Also, don’t forget to copy the log4j.jar library to your lib directory. The Log4j configuration example in listing 2.7 directs all log messages to the console.
Listing 2.7 An example log4j.properties configuration file
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n

# Root logger option
log4j.rootLogger=INFO, stdout

# Hibernate logging options (INFO only shows startup messages)
log4j.logger.org.hibernate=INFO

# Log JDBC bind parameter runtime arguments
log4j.logger.org.hibernate.type=INFO
The last category in this configuration file is especially interesting: It enables the logging of JDBC bind parameters if you set it to DEBUG level, providing information you usually don’t see in the ad hoc SQL console log. For a more comprehensive example, check the log4j.properties file bundled in the etc/ directory of the Hibernate distribution, and also look at the Log4j documentation for more information. Note that you should never log anything at DEBUG level in production, because doing so can seriously impact the performance of your application. You can also monitor Hibernate by enabling live statistics. Without an application server (that is, if you don’t have a JMX deployment environment), the easiest way to get statistics out of the Hibernate engine at runtime is the SessionFactory:
Statistics stats =
    HibernateUtil.getSessionFactory().getStatistics();

stats.setStatisticsEnabled(true);
...
stats.getSessionOpenCount();
stats.logSummary();

EntityStatistics itemStats =
    stats.getEntityStatistics("auction.model.Item");
itemStats.getFetchCount();
The statistics interfaces are Statistics for global information, EntityStatistics for information about a particular entity, CollectionStatistics for a particular collection role, QueryStatistics for SQL and HQL queries, and SecondLevelCacheStatistics for detailed runtime information about a particular region in the optional second-level data cache. A convenient method is logSummary(), which prints out a complete summary to the console with a single call. If you want to enable the collection of statistics through the configuration, and not programmatically, set the hibernate.generate_statistics configuration property to true. See the API documentation for more information about the various statistics retrieval methods. Before you run the “Hello World” application, check that your work directory has all the necessary files:
WORKDIR
build.xml
+lib
+src
  +hello
    HelloWorld.java
    Message.java
    Message.hbm.xml
  +persistence
    HibernateUtil.java
  hibernate.cfg.xml (or hibernate.properties)
  log4j.properties
The first file, build.xml, is the Ant build definition. It contains the Ant targets for building and running the application, which we’ll discuss next. You’ll also add a target that can generate the database schema automatically.
2.1.4 Running and testing the application
To run the application, you need to compile it first and start the database management system with the right database schema. Ant is a powerful build system for Java. Typically, you’d write a build.xml file for your project and call the build targets you defined in this file with the Ant command-line tool. You can also call Ant targets from your Java IDE, if that is supported.

Compiling the project with Ant

You’ll now add a build.xml file and some targets to the “Hello World” project. The initial content for the build file is shown in listing 2.8; you create this file directly in your WORKDIR.
Listing 2.8 A basic Ant build file for “Hello World”
[Ant build file; of the listing body, only three property values survive: src, lib, and bin.]
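A build file matching the description in this section can be sketched as follows. This is a reconstruction, not the original listing: the values src, lib, and bin, the ids project.classpath and meta.files, the four target names, and the default target come from the text, while the property names (proj.name, src.java.dir, lib.dir, build.dir), the dependency of compile on clean, and the details of each task are assumptions.

```xml
<project name="HelloWorld" default="compile" basedir=".">

    <!-- Project name (property name is an assumption) -->
    <property name="proj.name" value="HelloWorld"/>

    <!-- Global locations; the values src, lib, and bin come from the text -->
    <property name="src.java.dir" value="src"/>
    <property name="lib.dir" value="lib"/>
    <property name="build.dir" value="bin"/>

    <!-- Shortcut to all libraries in the library directory -->
    <path id="project.classpath">
        <fileset dir="${lib.dir}">
            <include name="**/*.jar"/>
        </fileset>
    </path>

    <!-- Shortcut pattern for configuration and metadata files -->
    <patternset id="meta.files">
        <include name="**/*.xml"/>
        <include name="**/*.properties"/>
    </patternset>

    <!-- Remove all created and compiled files -->
    <target name="clean">
        <delete dir="${build.dir}"/>
        <mkdir dir="${build.dir}"/>
    </target>

    <!-- Compile all Java sources into the build directory -->
    <target name="compile" depends="clean">
        <mkdir dir="${build.dir}"/>
        <javac srcdir="${src.java.dir}" destdir="${build.dir}" nowarn="on">
            <classpath refid="project.classpath"/>
        </javac>
    </target>

    <!-- Copy mapping and configuration files next to the classes -->
    <target name="copymetafiles">
        <copy todir="${build.dir}">
            <fileset dir="${src.java.dir}">
                <patternset refid="meta.files"/>
            </fileset>
        </copy>
    </target>

    <!-- Run the application from the build directory -->
    <target name="run" depends="compile, copymetafiles">
        <java fork="true" classname="hello.HelloWorld"
              classpathref="project.classpath">
            <classpath path="${build.dir}"/>
        </java>
    </target>

</project>
```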
The first half of this Ant build file contains property settings, such as the project name and global locations of files and directories. You can already see that this build is based on the existing directory layout, your WORKDIR (for Ant, this is the same directory as the basedir). The default target, when this build file is called with no named target, is compile.
Next, a name that can be easily referenced later, project.classpath, is defined as a shortcut to all libraries in the library directory of the project. Another shortcut for a pattern that will come in handy is defined as meta.files. You need to handle configuration and metadata files separately in the processing of the build, using this filter.

The clean target removes all created and compiled files, and cleans the project. The last three targets, compile, copymetafiles, and run, should be self-explanatory. Running the application depends on the compilation of all Java source files, and the copying of all mapping and property configuration files to the build directory.

Now, execute ant compile in your WORKDIR to compile the “Hello World” application. You should see no errors (nor any warnings) during compilation and find your compiled class files in the bin directory. Also call ant copymetafiles once, and check whether all configuration and mapping files are copied correctly into the bin directory.

Before you run the application, start the database management system and export a fresh database schema.

Starting the HSQL database system

Hibernate supports more than 25 SQL database management systems out of the box, and support for any unknown dialect can be added easily. If you have an existing database, or if you know basic database administration, you can also replace the configuration options (mostly connection and dialect settings) you created earlier with settings for your own preferred system.

To say hello to the world, you need a lightweight, no-frills database system that is easy to install and configure. A good choice is HSQLDB, an open source SQL database management system written in Java. It can run in-process with the main application, but in our experience, running it stand-alone with a TCP port listening for connections is usually more convenient.
You’ve already copied the hsqldb.jar file into the library directory of your WORKDIR—this library includes both the database engine and the JDBC driver required to connect to a running instance. To start the HSQLDB server, open up a command line, change into your WORKDIR, and run the command shown in figure 2.4. You should see startup messages and finally a help message that tells you how to shut down the database system (it’s OK to use Ctrl+C). You’ll also find some new files in your WORKDIR, starting with test—these are the files used by HSQLDB to store your data. If you want to start with a fresh database, delete the files between restarts of the server.
Figure 2.4 Starting the HSQLDB server from the command line
You now have an empty database that has no content, not even a schema. Let’s create the schema next.

Exporting the database schema

You can create the database schema by hand by writing SQL DDL with CREATE statements and executing this DDL on your database. Or (and this is much more convenient) you can let Hibernate take care of this and create a default schema for your application. The prerequisite in Hibernate for automatic generation of SQL DDL is always a Hibernate mapping metadata definition, either in XML mapping files or in Java source-code annotations. We assume that you’ve designed and implemented your domain model classes and written mapping metadata in XML as you followed the previous sections.

The tool used for schema generation is hbm2ddl; its class is org.hibernate.tool.hbm2ddl.SchemaExport, so it’s also sometimes called SchemaExport. There are many ways to run this tool and create a schema:

■ You can run <hbm2ddl> in an Ant target in your regular build procedure.
■ You can run SchemaExport programmatically in application code, maybe in your HibernateUtil startup class. This isn’t common, however, because you rarely need programmatic control over schema generation.
■ You can enable automatic export of a schema when your SessionFactory is built by setting the hibernate.hbm2ddl.auto configuration property to create or create-drop. The first setting results in DROP statements followed by CREATE statements when the SessionFactory is built. The second setting adds additional DROP statements when the application is shut down and the SessionFactory is closed, effectively leaving a clean database after every run.
Programmatic schema generation is straightforward:
Configuration cfg = new Configuration().configure();
SchemaExport schemaExport = new SchemaExport(cfg);
schemaExport.create(false, true);
A new SchemaExport object is created from a Configuration; all settings (such as the database driver, connection URL, and so on) are passed to the SchemaExport constructor. The create(false, true) call triggers the DDL generation process, without any SQL printed to stdout (because of the false setting), but with DDL immediately executed in the database (true). See the SchemaExport API for more information and additional settings.

Your development process determines whether you should enable automatic schema export with the hibernate.hbm2ddl.auto configuration setting. Many new Hibernate users find the automatic dropping and re-creation on SessionFactory build a little confusing. Once you’re more familiar with Hibernate, we encourage you to explore this option for fast turnaround times in integration testing.

An additional option for this configuration property, update, can be useful during development: it enables the built-in SchemaUpdate tool, which can make schema evolution easier. If enabled, Hibernate reads the JDBC database metadata on startup and creates new tables and constraints by comparing the old schema with the current mapping metadata. Note that this functionality depends on the quality of the metadata provided by the JDBC driver, an area in which many drivers are lacking. In practice, this feature is therefore less exciting and useful than it sounds.
WARNING
We’ve seen Hibernate users trying to use SchemaUpdate to update the schema of a production database automatically. This can quickly end in disaster and won’t be allowed by your DBA.
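The three hibernate.hbm2ddl.auto settings covered so far differ only in what happens when the SessionFactory is built and when it is closed. The following toy model summarizes that. It is plain Java with no Hibernate dependency, and the class and method names (Hbm2ddlAutoSketch, onBuild, onClose) are made up for illustration; it only restates the behavior described in the text.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Hbm2ddlAutoSketch {

    // Schema actions taken when the SessionFactory is built (toy model, not Hibernate code).
    public static List<String> onBuild(String setting) {
        if ("create".equals(setting) || "create-drop".equals(setting)) {
            // DROP statements first, then CREATE statements
            return Arrays.asList("DROP", "CREATE");
        }
        if ("update".equals(setting)) {
            // SchemaUpdate compares JDBC metadata with the mappings
            return Collections.singletonList("UPDATE");
        }
        return Collections.emptyList();
    }

    // Schema actions taken when the SessionFactory is closed.
    public static List<String> onClose(String setting) {
        if ("create-drop".equals(setting)) {
            // Additional DROP statements leave a clean database
            return Collections.singletonList("DROP");
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"create", "create-drop", "update"}) {
            System.out.println(s + ": build=" + onBuild(s) + ", close=" + onClose(s));
        }
    }
}
```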
You can also run SchemaUpdate programmatically:
Configuration cfg = new Configuration().configure();
SchemaUpdate schemaUpdate = new SchemaUpdate(cfg);
schemaUpdate.execute(false, true);

The false setting again disables printing of the SQL DDL to the console, and the true setting executes the statements directly on the database. If you export the DDL to the console or a text file, your DBA may be able to use it as a starting point to produce a quality schema-evolution script.

Another hbm2ddl.auto setting useful in development is validate. It enables SchemaValidator to run at startup. This tool can compare your mapping against the JDBC metadata and tell you if the schema and mappings match. You can also run SchemaValidator programmatically:
Configuration cfg = new Configuration().configure();
new SchemaValidator(cfg).validate();
An exception is thrown if a mismatch between the mappings and the database schema is detected. Because you’re basing your build system on Ant, you’ll ideally add a schemaexport target to your Ant build that generates and exports a fresh schema for your database whenever you need one (see listing 2.9).
Listing 2.9 Ant target for schema export
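The task definition and target described in the following paragraphs can be sketched as below. This is a reconstruction from that description rather than the original listing: the task class HibernateToolTask, the schemaexport target, and the drop, create, export, outputfilename, delimiter, format, and destdir options are named in the text, while the task name hibernatetool, the build.dir property, and the output file name are assumptions.

```xml
<taskdef name="hibernatetool"
         classname="org.hibernate.tool.ant.HibernateToolTask"
         classpathref="project.classpath"/>

<target name="schemaexport" depends="compile, copymetafiles"
        description="Exports a generated schema to DB and file">

    <hibernatetool destdir="${basedir}">
        <classpath path="${build.dir}"/>

        <!-- Reads hibernate.cfg.xml and the XML mapping files it lists -->
        <configuration
            configurationfile="${build.dir}/hibernate.cfg.xml"/>

        <!-- Exporter that turns the metadata model into SQL DDL -->
        <hbm2ddl
            drop="true"
            create="true"
            export="true"
            outputfilename="helloworld-ddl.sql"
            delimiter=";"
            format="true"/>
    </hibernatetool>

</target>
```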
In this target, you first define a new Ant task that you’d like to use, HibernateToolTask. This is a generic task that can do many things—exporting an SQL DDL schema from Hibernate mapping metadata is only one of them. You’ll use it throughout this chapter in all Ant builds. Make sure you include all Hibernate libraries, required third-party libraries, and your JDBC driver in the classpath of the task definition. You also need to add the hibernate-tools.jar file, which can be found in the Hibernate Tools download package.
The schemaexport Ant target uses this task, and it also depends on the compiled classes and copied configuration files in the build directory. The basic use of the task is always the same: A configuration is the starting point for all code artifact generation. The variation shown here, <configuration>, understands Hibernate XML configuration files and reads all Hibernate XML mapping metadata files listed in the given configuration. From that information, an internal Hibernate metadata model (which is what hbm stands for everywhere) is produced, and this model data is then processed subsequently by exporters. We discuss tool configurations that can read annotations or a database for reverse engineering later in this chapter.

The other element in the target is a so-called exporter. The tool configuration feeds its metadata information to the exporter you selected; in the preceding example, it’s the <hbm2ddl> exporter. As you may have guessed, this exporter understands the Hibernate metadata model and produces SQL DDL. You can control the DDL generation with several options:

■ The exporter generates SQL, so it’s mandatory that you set an SQL dialect in your Hibernate configuration file.
■ If drop is set to true, SQL DROP statements will be generated first, and all tables and constraints are removed if they exist. If create is set to true, SQL CREATE statements are generated next, to create all tables and constraints. If you enable both options, you effectively drop and re-create the database schema on every run of the Ant target.
■ If export is set to true, all DDL statements are directly executed in the database. The exporter opens a connection to the database using the connection settings found in your configuration file.
■ If an outputfilename is present, all DDL statements are written to this file, and the file is saved in the destdir you configured. The delimiter character is appended to all SQL statements written to the file, and if format is enabled, all SQL statements are nicely indented.
You can now generate, print, and directly export the schema to a text file and the database by running ant schemaexport in your WORKDIR. All tables and constraints are dropped and then created again, and you have a fresh database ready. (Ignore any error message that says that a table couldn’t be dropped because it didn’t exist.)
Check that your database is running and that it has the correct database schema. A useful tool included with HSQLDB is a simple database browser. You can call it with the following Ant target:
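A target along these lines starts the browser. It is a sketch under assumptions: the target name dbmanager, the connection URL, and the reuse of project.classpath are not taken from the original listing, and the class and argument names are those of the Swing database manager shipped with HSQLDB 1.8.

```xml
<target name="dbmanager" description="Start HSQLDB manager">
    <java classname="org.hsqldb.util.DatabaseManagerSwing"
          fork="yes"
          classpathref="project.classpath"
          failonerror="true">
        <!-- Connect to the stand-alone server started earlier -->
        <arg value="-url"/>
        <arg value="jdbc:hsqldb:hsql://localhost/"/>
        <arg value="-driver"/>
        <arg value="org.hsqldb.jdbcDriver"/>
    </java>
</target>
```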
You should see the schema shown in figure 2.5 after logging in. Run your application with ant run, and watch the console for Hibernate log output. You should see your messages being stored, loaded, and printed. Fire an SQL query in the HSQLDB browser to check the content of your database directly.

Figure 2.5 The HSQLDB browser and SQL console

You now have a working Hibernate infrastructure and Ant project build. You could skip to the next chapter and continue writing and mapping more complex business classes. However, we recommend that you spend some time with the “Hello World” application and extend it with more functionality. You can, for example, try different HQL queries or logging options. Don’t forget that your database system is still running in the background, and that you have to either export a fresh schema or stop it and delete the database files to get a clean and empty database again.

In the next section, we walk through the “Hello World” example again, with Java Persistence interfaces and EJB 3.0.
2.2 Starting a Java Persistence project
In the following sections, we show you some of the advantages of JPA and the new EJB 3.0 standard, and how annotations and the standardized programming interfaces can simplify application development, even when compared with Hibernate. Obviously, designing and linking to standardized interfaces is an advantage if you ever need to port or deploy an application on a different runtime environment. Besides portability, though, there are many good reasons to give JPA a closer look. We’ll now guide you through another “Hello World” example, this time with Hibernate Annotations and Hibernate EntityManager. You’ll reuse the basic project infrastructure introduced in the previous section so you can see where JPA differs from Hibernate. After working with annotations and the JPA interfaces, we’ll show how an application integrates and interacts with other managed components—EJBs. We’ll discuss many more application design examples later in the book; however, this first glimpse will let you decide on a particular approach as soon as possible.
2.2.1 Using Hibernate Annotations
Let’s first use Hibernate Annotations to replace the Hibernate XML mapping files with inline metadata. You may want to copy your existing “Hello World” project directory before you make the following changes—you’ll migrate from native Hibernate to standard JPA mappings (and program code later on). Copy the Hibernate Annotations libraries to your WORKDIR/lib directory—see the Hibernate Annotations documentation for a list of required libraries. (At the time of writing, hibernate-annotations.jar and the API stubs in ejb3-persistence.jar were required.) Now delete the src/hello/Message.hbm.xml file. You’ll replace this file with annotations in the src/hello/Message.java class source, as shown in listing 2.10.
Listing 2.10 Mapping the Message class with annotations
package hello;

import javax.persistence.*;

@Entity
@Table(name = "MESSAGES")
public class Message {

    @Id @GeneratedValue
    @Column(name = "MESSAGE_ID")
    private Long id;

    @Column(name = "MESSAGE_TEXT")
    private String text;

    @ManyToOne(cascade = CascadeType.ALL)
    @JoinColumn(name = "NEXT_MESSAGE_ID")
    private Message nextMessage;

    private Message() {}

    public Message(String text) {
        this.text = text;
    }

    public Long getId() {
        return id;
    }
    private void setId(Long id) {
        this.id = id;
    }

    public String getText() {
        return text;
    }
    public void setText(String text) {
        this.text = text;
    }

    public Message getNextMessage() {
        return nextMessage;
    }
    public void setNextMessage(Message nextMessage) {
        this.nextMessage = nextMessage;
    }
}
The first thing you’ll probably notice in this updated business class is the import of the javax.persistence interfaces. Inside this package are all the standardized JPA annotations you need to map the @Entity class to a database @Table. You put annotations on the private fields of the class, starting with @Id and @GeneratedValue for the database identifier mapping. The JPA persistence provider detects that the @Id annotation is on a field and assumes that it should access properties on an object directly through fields at runtime. If you placed the @Id annotation on the getId() method, you’d enable access to properties through getter and setter methods by default. Hence, all other annotations are also placed on either fields or getter methods, following the selected strategy.

Note that the @Table, @Column, and @JoinColumn annotations aren’t necessary. All properties of an entity are automatically considered persistent, with default strategies and table/column names. You add them here for clarity and to get the same results as with the XML mapping file. Compare the two mapping metadata strategies now, and you’ll see that annotations are much more convenient and reduce the lines of metadata significantly. Annotations are also type-safe, they support autocompletion in your IDE as you type (like any other Java interfaces), and they make refactoring of classes and properties easier.

If you’re worried that the import of the JPA interfaces will bind your code to this package, you should know that it’s only required on your classpath when the annotations are used by Hibernate at runtime. You can load and execute this class without the JPA interfaces on your classpath as long as you don’t want to load and store instances with Hibernate.

A second concern that developers new to annotations sometimes have relates to the inclusion of configuration metadata in Java source code. By definition, configuration metadata is metadata that can change for each deployment of the application, such as table names. JPA has a simple solution: You can override or replace all annotated metadata with XML metadata files. Later in the book, we’ll show you how this is done.
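The rule that the position of @Id selects field or property access can be illustrated with plain reflection. The sketch below is not Hibernate's implementation: it defines its own stand-in Id annotation so that it compiles without the JPA libraries, and the names AccessTypeSketch, FieldMapped, GetterMapped, and detectAccess are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class AccessTypeSketch {

    // Stand-in for javax.persistence.Id (assumption: keeps the sketch free of JPA jars).
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD})
    public @interface Id {}

    // Identifier annotated on the field, as in the Message class.
    public static class FieldMapped {
        @Id private Long id;
        public Long getId() { return id; }
    }

    // Identifier annotated on the getter method instead.
    public static class GetterMapped {
        private Long id;
        @Id public Long getId() { return id; }
    }

    // Returns "field" if @Id sits on a field, "property" if it sits on a method.
    public static String detectAccess(Class<?> entity) {
        for (Field f : entity.getDeclaredFields()) {
            if (f.isAnnotationPresent(Id.class)) {
                return "field";
            }
        }
        for (Method m : entity.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Id.class)) {
                return "property";
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(detectAccess(FieldMapped.class));
        System.out.println(detectAccess(GetterMapped.class));
    }
}
```

A real provider applies the detected strategy to every other mapped attribute of the class, which is why all annotations follow the placement of @Id.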
Let’s assume that this is all you want from JPA—annotations instead of XML. You don’t want to use the JPA programming interfaces or query language; you’ll use Hibernate Session and HQL. The only other change you need to make to your project, besides deleting the now obsolete XML mapping file, is a change in the Hibernate configuration, in hibernate.cfg.xml:
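The change amounts to swapping the mapping entry. A minimal sketch of the relevant part of hibernate.cfg.xml, assuming the standard Hibernate 3 configuration format; the property settings are whatever you configured earlier and are elided here:

```xml
<hibernate-configuration>
    <session-factory>
        <!-- ... property settings as before ... -->

        <!-- Annotated class; replaces the entry for the XML mapping file -->
        <mapping class="hello.Message"/>
    </session-factory>
</hibernate-configuration>
```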
The Hibernate configuration file previously had a list of all XML mapping files. This has been replaced with a list of all annotated classes. If you use programmatic configuration of a SessionFactory, the addAnnotatedClass() method replaces the addResource() method:
// Load settings from hibernate.properties
AnnotationConfiguration cfg = new AnnotationConfiguration();

// ... set other configuration options programmatically

cfg.addAnnotatedClass(hello.Message.class);

SessionFactory sessionFactory = cfg.buildSessionFactory();
Note that you have now used AnnotationConfiguration instead of the basic Hibernate Configuration interface; this extension understands annotated classes. At a minimum, you also need to change your initializer in HibernateUtil to use that interface. If you export the database schema with an Ant target, replace <configuration> with <annotationconfiguration> in your build.xml file.

This is all you need to change to run the “Hello World” application with annotations. Try running it again, probably with a fresh database.

Annotation metadata can also be global, although you don’t need this for the “Hello World” application. Global annotation metadata is placed in a file named package-info.java in a particular package directory. In addition to listing annotated classes, you need to add the packages that contain global metadata to your configuration. For example, in a Hibernate XML configuration file, you need to add the following:
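A sketch of such an entry, assuming the hello package holds the package-info.java file; the package attribute of the mapping element is part of the Hibernate 3 configuration format, and the remaining settings are elided:

```xml
<hibernate-configuration>
    <session-factory>
        <!-- ... property settings as before ... -->

        <!-- Package with global annotation metadata (package-info.java) -->
        <mapping package="hello"/>
        <mapping class="hello.Message"/>
    </session-factory>
</hibernate-configuration>
```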
Or you could achieve the same results with programmatic configuration:
// Load settings from hibernate.properties
AnnotationConfiguration cfg = new AnnotationConfiguration();

// ... set other configuration options programmatically

// addAnnotatedClass(), not addClass(): addClass() would look for
// an XML mapping file, which you have deleted
cfg.addAnnotatedClass(hello.Message.class);
cfg.addPackage("hello");

SessionFactory sessionFactory = cfg.buildSessionFactory();
Let’s take this one step further and replace the native Hibernate code that loads and stores messages with code that uses JPA. With Hibernate Annotations and Hibernate EntityManager, you can create portable and standards-compliant mappings and data access code.
2.2.2 Using Hibernate EntityManager
Hibernate EntityManager is a wrapper around Hibernate Core that provides the JPA programming interfaces, supports the JPA entity instance lifecycle, and allows you to write queries with the standardized Java Persistence query language. Because JPA functionality is a subset of Hibernate’s native capabilities, you may wonder why you should use the EntityManager package on top of Hibernate. We’ll present a list of advantages later in this section, but you’ll see one particular simplification as soon as you configure your project for Hibernate EntityManager: You no longer have to list all annotated classes (or XML mapping files) in your configuration file. Let’s modify the “Hello World” project and prepare it for full JPA compatibility.

Basic JPA configuration

A SessionFactory represents a particular logical data-store configuration in a Hibernate application. The EntityManagerFactory has the same role in a JPA application, and you configure an EntityManagerFactory (EMF) either with configuration files or in application code just as you would configure a SessionFactory. The configuration of an EMF, together with a set of mapping metadata (usually annotated classes), is called the persistence unit.

The notion of a persistence unit also includes the packaging of the application, but we want to keep this as simple as possible for “Hello World”; we’ll assume that you want to start with a standardized JPA configuration and no special packaging. Not only the content, but also the name and location of the JPA configuration file for a persistence unit are standardized.

Create a directory named WORKDIR/etc/META-INF and place the basic configuration file named persistence.xml, shown in listing 2.11, in that directory:
Listing 2.11 Persistence unit configuration file
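A persistence.xml of the kind described here can be sketched as follows. It is a reconstruction, not the original listing: the unit name helloworld and the hibernate.ejb.cfgfile property are taken from the text, the header is the standard JPA 1.0 schema declaration, and the property value /hibernate.cfg.xml is an assumption based on the file sitting in the root of the classpath.

```xml
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
        http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
    version="1.0">

    <persistence-unit name="helloworld">
        <properties>
            <!-- Reuse the existing Hibernate configuration file -->
            <property name="hibernate.ejb.cfgfile"
                      value="/hibernate.cfg.xml"/>
        </properties>
    </persistence-unit>

</persistence>
```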
Every persistence unit needs a name, and in this case it’s helloworld.
NOTE
The XML header in the preceding persistence unit configuration file declares what schema should be used, and it’s always the same. We’ll omit it in future examples and assume that you’ll add it.
A persistence unit is further configured with an arbitrary number of properties, which are all vendor-specific. The property in the previous example, hibernate.ejb.cfgfile, acts as a catchall. It refers to a hibernate.cfg.xml file (in the root of the classpath) that contains all settings for this persistence unit—you’re reusing the existing Hibernate configuration. Later, you’ll move all configuration details into the persistence.xml file, but for now you’re more interested in running “Hello World” with JPA. The JPA standard says that the persistence.xml file needs to be present in the META-INF directory of a deployed persistence unit. Because you aren’t really packaging and deploying the persistence unit, this means that you have to copy persistence.xml into a META-INF directory of the build output directory. Modify your build.xml, and add the following to the copymetafiles target:
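The addition might look like the fragment below, placed inside the existing copymetafiles target. It assumes a build.dir property for the build output directory (a hypothetical name) and reuses the meta.files pattern; the directory name etc comes from the text.

```xml
<!-- Copy configuration files from etc/, including META-INF/persistence.xml -->
<copy todir="${build.dir}">
    <fileset dir="etc">
        <patternset refid="meta.files"/>
    </fileset>
</copy>
```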
Everything found in WORKDIR/etc that matches the meta.files pattern is copied to the build output directory, which is part of the classpath at runtime. Let’s rewrite the main application code with JPA.

“Hello World” with JPA

These are your primary programming interfaces in Java Persistence:

■ javax.persistence.Persistence: A startup class that provides a static method for the creation of an EntityManagerFactory.
■ javax.persistence.EntityManagerFactory: The equivalent to a Hibernate SessionFactory. This runtime object represents a particular persistence unit. It’s thread-safe, is usually handled as a singleton, and provides methods for the creation of EntityManager instances.
■ javax.persistence.EntityManager: The equivalent to a Hibernate Session. This single-threaded, nonshared object represents a particular unit of work for data access. It provides methods to manage the lifecycle of entity instances and to create Query instances.
■ javax.persistence.Query: The equivalent to a Hibernate Query. A Query object represents a particular JPA query language or native SQL query; it allows safe binding of parameters and provides various methods for the execution of the query.
■ javax.persistence.EntityTransaction: The equivalent to a Hibernate Transaction, used in Java SE environments for the demarcation of RESOURCE_LOCAL transactions. In Java EE, you rely on the standardized javax.transaction.UserTransaction interface of JTA for programmatic transaction demarcation.

To use the JPA interfaces, you need to copy the required libraries to your WORKDIR/lib directory; check the documentation bundled with Hibernate EntityManager for an up-to-date list. You can then rewrite the code in WORKDIR/src/hello/HelloWorld.java and switch from Hibernate to JPA interfaces (see listing 2.12).
Listing 2.12 The “Hello World” main application code with JPA
package hello;

import java.util.*;
import javax.persistence.*;

public class HelloWorld {

    public static void main(String[] args) {

        // Start EntityManagerFactory
        EntityManagerFactory emf =
            Persistence.createEntityManagerFactory("helloworld");

        // First unit of work
        EntityManager em = emf.createEntityManager();
        EntityTransaction tx = em.getTransaction();
        tx.begin();

        Message message = new Message("Hello World");
        em.persist(message);

        tx.commit();
        em.close();

        // Second unit of work
        EntityManager newEm = emf.createEntityManager();
        EntityTransaction newTx = newEm.getTransaction();
        newTx.begin();

        List messages = newEm
            .createQuery("select m from Message m order by m.text asc")
            .getResultList();

        System.out.println( messages.size() + " message(s) found" );

        for (Object m : messages) {
            Message loadedMsg = (Message) m;
            System.out.println(loadedMsg.getText());
        }

        newTx.commit();
        newEm.close();

        // Shutting down the application
        emf.close();
    }
}
The first thing you probably notice in this code is that there is no Hibernate import anymore, only javax.persistence.*. The EntityManagerFactory is created with a static call to Persistence and the name of the persistence unit. The rest of the code should be self-explanatory; you use JPA just like Hibernate, though there are some minor differences in the API, and methods have slightly different names. Furthermore, you didn’t use the HibernateUtil class for static initialization of the infrastructure; you can write a JPAUtil class and move the creation of an EntityManagerFactory there if you want, or you can remove the now unused WORKDIR/src/persistence package. JPA also supports programmatic configuration, with a map of options:
Map myProperties = new HashMap();
myProperties.put("hibernate.hbm2ddl.auto", "create-drop");

EntityManagerFactory emf =
    Persistence.createEntityManagerFactory("helloworld", myProperties);
Custom programmatic properties override any property you’ve set in the persistence.xml configuration file. Try to run the ported HelloWorld code with a fresh database. You should see the exact same log output on your screen as you did with native Hibernate, because the JPA persistence provider engine is Hibernate.

Automatic detection of metadata

We promised earlier that you won’t have to list all your annotated classes or XML mapping files in the configuration, but the list is still there, in hibernate.cfg.xml. Let’s enable the autodetection feature of JPA. Run the “Hello World” application again after switching to DEBUG logging for the org.hibernate package. Some additional lines should appear in your log:
...
Ejb3Configuration:141 - Trying to find persistence unit: helloworld
Ejb3Configuration:150 - Analyse of persistence.xml: file:/helloworld/build/META-INF/persistence.xml
PersistenceXmlLoader:115 - Persistent Unit name from persistence.xml: helloworld
Ejb3Configuration:359 - Detect class: true; detect hbm: true
JarVisitor:178 - Searching mapped entities in jar/par: file:/helloworld/build
JarVisitor:217 - Filtering: hello.HelloWorld
JarVisitor:217 - Filtering: hello.Message
JarVisitor:255 - Java element filter matched for hello.Message
Ejb3Configuration:101 - Creating Factory: helloworld
...
On startup, the Persistence.createEntityManagerFactory() method tries to locate the persistence unit named helloworld. It searches the classpath for all META-INF/persistence.xml files and then configures the EMF if a match is found. The second part of the log shows something you probably didn’t expect. The JPA persistence provider tried to find all annotated classes and all Hibernate XML mapping files in the build output directory. The list of annotated classes (or the list of XML mapping files) in hibernate.cfg.xml isn’t needed, because hello.Message, the annotated entity class, has already been found. Instead of removing only this single unnecessary option from hibernate.cfg.xml, let’s remove the whole file and move all configuration details into persistence.xml (see listing 2.13).
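The filtering step visible in the log, where every class in the build directory is examined and only annotated entities are kept, can be pictured with a few lines of reflection. This is a simplified illustration, not how Hibernate EntityManager scans archives: the Entity annotation below is a local stand-in for javax.persistence.Entity, the candidate classes are passed in directly instead of being read from the classpath, and all names are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class EntityScanSketch {

    // Stand-in for javax.persistence.Entity (assumption: no JPA jars needed).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Entity {}

    // Plays the role of hello.Message: an annotated entity class.
    @Entity
    public static class Message {}

    // Plays the role of hello.HelloWorld: an ordinary class.
    public static class HelloWorld {}

    // Keeps only the candidates that carry the entity annotation.
    public static List<String> findEntities(Class<?>... candidates) {
        List<String> matched = new ArrayList<String>();
        for (Class<?> candidate : candidates) {
            if (candidate.isAnnotationPresent(Entity.class)) {
                matched.add(candidate.getSimpleName());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Both classes are filtered; only the annotated one matches.
        System.out.println(findEntities(HelloWorld.class, Message.class));
    }
}
```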
Listing 2.13 Full persistence unit configuration file
<provider>org.hibernate.ejb.HibernatePersistence</provider>
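A complete persistence unit configuration along these lines can be sketched as below. Only the provider class name survives from the listing; the rest is a reconstruction from the surrounding description. The provider and class elements and the archive.autodetection setting are discussed in the text, whereas the concrete connection, dialect, and logging values are assumptions based on the HSQLDB setup used in this chapter, and the schema header is omitted.

```xml
<persistence-unit name="helloworld">

    <provider>org.hibernate.ejb.HibernatePersistence</provider>

    <!-- Optional with Hibernate, which detects annotated classes itself
    <class>hello.Message</class>
    -->

    <properties>
        <!-- Scan for annotated classes and Hibernate XML mapping files -->
        <property name="hibernate.archive.autodetection"
                  value="class, hbm"/>

        <!-- Settings moved over from hibernate.cfg.xml (values assumed) -->
        <property name="hibernate.show_sql" value="true"/>
        <property name="hibernate.format_sql" value="true"/>
        <property name="hibernate.connection.driver_class"
                  value="org.hsqldb.jdbcDriver"/>
        <property name="hibernate.connection.url"
                  value="jdbc:hsqldb:hsql://localhost"/>
        <property name="hibernate.connection.username" value="sa"/>
        <property name="hibernate.dialect"
                  value="org.hibernate.dialect.HSQLDialect"/>
    </properties>

</persistence-unit>
```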
There are three interesting new elements in this configuration file. First, you set an explicit <provider> that should be used for this persistence unit. This is usually required only if you work with several JPA implementations at the same time, but we hope that Hibernate will, of course, be the only one. Next, the specification requires that you list all annotated classes with <class> elements if you deploy in a non-Java EE environment; Hibernate supports autodetection of mapping metadata everywhere, making this optional. Finally, the Hibernate configuration setting archive.autodetection tells Hibernate what metadata to scan for automatically: annotated classes (class) and/or Hibernate XML mapping files (hbm). By default, Hibernate EntityManager scans for both. The rest of the configuration file contains all options we explained and used earlier in this chapter in the regular hibernate.cfg.xml file.
Automatic detection of annotated classes and XML mapping files is a great feature of JPA. It’s usually only available in a Java EE application server; at least, this is what the EJB 3.0 specification guarantees. But Hibernate, as a JPA provider, also implements it in plain Java SE, though you may not be able to use the exact same configuration with any other JPA provider. You’ve now created an application that is fully JPA specification-compliant. Your project directory should look like this (note that we also moved log4j.properties to the etc/ directory):
WORKDIR
+etc
    log4j.properties
    +META-INF
        persistence.xml
+lib
+src
    +hello
        HelloWorld.java
        Message.java
All JPA configuration settings are bundled in persistence.xml, all mapping metadata is included in the Java source code of the Message class, and Hibernate automatically scans and finds the metadata on startup. Compared to pure Hibernate, you now have these benefits:
■ Automatic scanning of deployed metadata, an important feature in large projects. Maintaining a list of annotated classes or mapping files becomes difficult if hundreds of entities are developed by a large team.
■ Standardized and simplified configuration, with a standard location for the configuration file, and a deployment concept (the persistence unit) that has many more advantages in larger projects that wrap several units (JARs) in an application archive (EAR).
■ Standardized data access code, entity instance lifecycle, and queries that are fully portable. There is no proprietary import in your application.
These are only some of the advantages of JPA. You’ll see its real power if you combine it with the full EJB 3.0 programming model and other managed components.
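The automatic scanning described above can be pictured as a filter over candidate classes. The following is only a minimal sketch under stated assumptions: MarkedEntity is a hypothetical stand-in for the real entity annotation, SketchMessage and SketchHelloWorld are hypothetical stand-ins for hello.Message and hello.HelloWorld, and the real provider inspects archives and directories rather than a prepared list of classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the entity annotation, so the sketch needs only the JDK.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface MarkedEntity {}

// Hypothetical classes playing the roles of hello.Message and hello.HelloWorld.
@MarkedEntity
class SketchMessage {}

class SketchHelloWorld {}

class EntityScanSketch {

    // Keeps only the candidates that carry the marker annotation, similar in
    // spirit to the "Java element filter matched" line in the log output.
    static List<Class<?>> findEntities(List<Class<?>> candidates) {
        List<Class<?>> entities = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (candidate.isAnnotationPresent(MarkedEntity.class)) {
                entities.add(candidate);
            }
        }
        return entities;
    }

    public static void main(String[] args) {
        List<Class<?>> found =
            findEntities(Arrays.asList(SketchMessage.class, SketchHelloWorld.class));
        for (Class<?> entity : found) {
            System.out.println("Filter matched for " + entity.getName());
        }
    }
}
```

The point of the sketch is that the annotation itself carries enough information to identify an entity, which is why a hand-maintained list of classes becomes redundant once scanning is available.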
2.2.3
Introducing EJB components
Java Persistence starts to shine when you also work with EJB 3.0 session beans and message-driven beans (and other Java EE 5.0 standards). The EJB 3.0 specification has been designed to permit the integration of persistence, so you can, for example, get automatic transaction demarcation on bean method boundaries, or a persistence context (think Session) that spans the lifecycle of a stateful session EJB. This section will get you started with EJB 3.0 and JPA in a managed Java EE environment; you’ll again modify the “Hello World” application to learn the basics. You need a Java EE environment first—a runtime container that provides Java EE services. There are two ways you can get it:
■ You can install a full Java EE 5.0 application server that supports EJB 3.0 and JPA. Several open source (Sun GlassFish, JBoss AS, ObjectWeb EasyBeans) and other proprietary licensed alternatives are on the market at the time of writing, and probably more will be available when you read this book.
■ You can install a modular server that provides only the services you need, selected from the full Java EE 5.0 bundle. At a minimum, you probably want an EJB 3.0 container, JTA transaction services, and a JNDI registry. At the time of writing, only JBoss AS provided modular Java EE 5.0 services in an easily customizable package.
To keep things simple and to show you how easy it is to get started with EJB 3.0, you’ll install and configure the modular JBoss Application Server and enable only the Java EE 5.0 services you need.
Installing the EJB container
Go to http://jboss.com/products/ejb3, download the modular embeddable server, and unzip the downloaded archive. Copy all libraries that come with the server into your project’s WORKDIR/lib directory, and copy all included configuration files to your WORKDIR/etc directory. You should now have the following directory layout:
WORKDIR
+etc
    default.persistence.properties
    ejb3-interceptors-aop.xml
    embedded-jboss-beans.xml
    jndi.properties
    log4j.properties
    +META-INF
        helloworld-beans.xml
        persistence.xml
+lib
+src
    +hello
        HelloWorld.java
        Message.java
The JBoss embeddable server relies on Hibernate for Java Persistence, so the default.persistence.properties file contains default settings for Hibernate that are needed for all deployments (such as JTA integration settings). The ejb3-interceptors-aop.xml and embedded-jboss-beans.xml configuration files contain the services configuration of the server—you can look at these files, but you don’t need to modify them now. By default, at the time of writing, the enabled services are JNDI, JCA, JTA, and the EJB 3.0 container—exactly what you need. To migrate the “Hello World” application, you need a managed datasource, which is a database connection that is handled by the embeddable server. The easiest way to configure a managed datasource is to add a configuration file that deploys the datasource as a managed service. Create the file in listing 2.14 as WORKDIR/etc/META-INF/helloworld-beans.xml.
Listing 2.14 Datasource configuration file for the JBoss server
java:/HelloWorldDS
org.hsqldb.jdbcDriver
jdbc:hsqldb:hsql://localhost
sa
0
<property name="maxSize">10</property>
<property name="blockingTimeout">1000</property>
<property name="idleTimeout">100000</property>
Again, the XML header and schema declaration aren’t important for this example. You set up two beans: The first is a factory that can produce the second type of bean. The LocalTxDataSource is effectively now your database connection pool, and all your connection pool settings are available on this factory. The factory binds a managed datasource under the JNDI name java:/HelloWorldDS. The second bean configuration declares how the registered object named HelloWorldDS should be instantiated, if another service looks it up in the JNDI registry. Your “Hello World” application asks for the datasource under this name, and the server calls getDatasource() on the LocalTxDataSource factory to obtain it.
Also note that we added some line breaks in the property values to make this more readable; you shouldn’t do this in your real configuration file (unless your database username contains a line break).
Configuring the persistence unit
Next, you need to change the persistence unit configuration of the “Hello World” application to access a managed JTA datasource, instead of a resource-local connection pool. Change your WORKDIR/etc/META-INF/persistence.xml file as follows:
<jta-data-source>java:/HelloWorldDS</jta-data-source>
You removed many Hibernate configuration options that are no longer relevant, such as the connection pool and database connection settings. Instead, you set a <jta-data-source> property with the name of the datasource as bound in JNDI. Don’t forget that you still need to configure the correct SQL dialect and any other Hibernate options that aren’t present in default.persistence.properties. The installation and configuration of the environment is now complete (we’ll show you the purpose of the jndi.properties file in a moment), and you can rewrite the application code with EJBs.
Writing EJBs
There are many ways to design and create an application with managed components. The “Hello World” application isn’t sophisticated enough to show elaborate examples, so we’ll introduce only the most basic type of EJB, a stateless session bean. (You’ve already seen entity classes: annotated plain Java classes that can have persistent instances. Note that the term entity bean only refers to the old EJB 2.1 entity beans; EJB 3.0 and Java Persistence standardize a lightweight programming model for plain entity classes.)
Every EJB session bean needs a business interface. This isn’t a special interface that needs to implement predefined methods or extend existing ones; it’s plain Java. Create the following interface in the WORKDIR/src/hello package:
package hello;

public interface MessageHandler {

    public void saveMessages();

    public void showMessages();
}
A MessageHandler can save and show messages; it’s straightforward. The actual EJB implements this business interface, which is by default considered a local interface (that is, remote EJB clients cannot call it); see listing 2.15.
Listing 2.15 The “Hello World” EJB session bean application code
package hello;

import javax.ejb.Stateless;
import javax.persistence.*;
import java.util.List;

@Stateless
public class MessageHandlerBean implements MessageHandler {

    @PersistenceContext
    EntityManager em;

    public void saveMessages() {
        Message message = new Message("Hello World");
        em.persist(message);
    }

    public void showMessages() {
        List messages =
            em.createQuery("select m from Message m order by m.text asc")
              .getResultList();
        System.out.println(messages.size() + " message(s) found:");
        for (Object m : messages) {
            Message loadedMsg = (Message) m;
            System.out.println(loadedMsg.getText());
        }
    }
}
There are several interesting things to observe in this implementation. First, it’s a plain Java class with no hard dependencies on any other package. It becomes an EJB only with a single metadata annotation, @Stateless. EJBs support container-managed services, so you can apply the @PersistenceContext annotation, and the server injects a fresh EntityManager instance whenever a method on this stateless bean is called. Each method is also assigned a transaction automatically by the container. The transaction starts when the method is called, and commits when the method returns. (It would be rolled back when an exception is thrown inside the method.) You can now modify the HelloWorld main class and delegate all the work of storing and showing messages to the MessageHandler.
Running the application
The main class of the “Hello World” application calls the MessageHandler stateless session bean after looking it up in the JNDI registry. Obviously, the managed environment and the whole application server, including the JNDI registry, must be booted first. You do all of this in the main() method of HelloWorld.java (see listing 2.16).
Listing 2.16 “Hello World” main application code, calling EJBs
package hello;

import org.jboss.ejb3.embedded.EJB3StandaloneBootstrap;
import javax.naming.InitialContext;

public class HelloWorld {

    public static void main(String[] args) throws Exception {

        // Boot the JBoss Microcontainer with EJB3 settings, automatically
        // loads ejb3-interceptors-aop.xml and embedded-jboss-beans.xml
        EJB3StandaloneBootstrap.boot(null);

        // Deploy custom stateless beans (datasource, mostly)
        EJB3StandaloneBootstrap
            .deployXmlResource("META-INF/helloworld-beans.xml");

        // Deploy all EJBs found on classpath (slow, scans all)
        // EJB3StandaloneBootstrap.scanClasspath();

        // Deploy all EJBs found on classpath (fast, scans build directory)
        // This is a relative location, matching the substring end of one
        // of java.class.path locations. Print out the value of
        // System.getProperty("java.class.path") to see all paths.
        EJB3StandaloneBootstrap.scanClasspath("helloworld-ejb3/bin");

        // Create InitialContext from jndi.properties
        InitialContext initialContext = new InitialContext();

        // Look up the stateless MessageHandler EJB
        MessageHandler msgHandler = (MessageHandler) initialContext
            .lookup("MessageHandlerBean/local");

        // Call the stateless EJB
        msgHandler.saveMessages();
        msgHandler.showMessages();

        // Shut down EJB container
        EJB3StandaloneBootstrap.shutdown();
    }
}
The first command in main() boots the server’s kernel and deploys the base services found in the service configuration files. Next, the datasource factory configuration you created earlier in helloworld-beans.xml is deployed, and the datasource is bound to JNDI by the container. From that point on, the container is ready to deploy EJBs. The easiest (but often not the fastest) way to deploy all EJBs is to let the container search the whole classpath for any class that has an EJB annotation. To learn about the many other deployment options available, check the JBoss AS documentation bundled in the download. To look up an EJB, you need an InitialContext, which is your entry point for the JNDI registry. If you instantiate an InitialContext, Java automatically looks for the file jndi.properties on your classpath. You need to create this file in WORKDIR/etc with settings that match the JBoss server’s JNDI registry configuration:
java.naming.factory.initial org.jnp.interfaces.LocalOnlyContextFactory
java.naming.factory.url.pkgs org.jboss.naming:org.jnp.interfaces
You don’t need to know exactly what this configuration means, but it basically points your InitialContext to a JNDI registry running in the local virtual machine (remote EJB client calls would require a JNDI service that supports remote communication). By default, you look up the MessageHandler bean by the name of an implementation class, with the /local suffix for a local interface. How EJBs are named, how they’re bound to JNDI, and how you look them up varies and can be customized. These are the defaults for the JBoss server. Finally, you call the MessageHandler EJB and let it do all the work automatically in two units—each method call will result in a separate transaction.
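To make the two jndi.properties entries less opaque, the following sketch parses them with the JDK’s java.util.Properties and reads them back through the standard constant names in javax.naming.Context. It is a sketch under stated assumptions: the entries are inlined as a string rather than read from the classpath, the class name JndiPropertiesSketch is hypothetical, and no JBoss class is loaded or required.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import javax.naming.Context;

class JndiPropertiesSketch {

    // The two entries from the text, inlined so the sketch needs no file.
    // In the properties format, whitespace may separate a key from its value.
    static final String JNDI_PROPERTIES =
          "java.naming.factory.initial org.jnp.interfaces.LocalOnlyContextFactory\n"
        + "java.naming.factory.url.pkgs org.jboss.naming:org.jnp.interfaces\n";

    // Parses the inlined entries into key/value pairs.
    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(JNDI_PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load();
        // Context.INITIAL_CONTEXT_FACTORY is the key "java.naming.factory.initial";
        // Context.URL_PKG_PREFIXES is the key "java.naming.factory.url.pkgs".
        System.out.println(props.getProperty(Context.INITIAL_CONTEXT_FACTORY));
        System.out.println(props.getProperty(Context.URL_PKG_PREFIXES));
    }
}
```

The first key names the factory class that creates the initial context; the second lists package prefixes that are searched for URL context factories.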
This completes our first example with managed EJB components and integrated JPA. You can probably already see how automatic transaction demarcation and EntityManager injection can improve the readability of your code. Later, we’ll show you how stateful session beans can help you implement sophisticated conversations between the user and the application, with transactional semantics. Furthermore, the EJB components don’t contain any unnecessary glue code or infrastructure methods, and they’re fully reusable, portable, and executable in any EJB 3.0 container.
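The idea of automatic transaction demarcation on method boundaries can be illustrated with a plain JDK dynamic proxy. This is only a conceptual sketch, not how any EJB container is implemented: SketchHandler and TxDemarcationSketch are hypothetical names, and the "transaction" is simulated by recording events in a list instead of talking to a transaction manager.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface, loosely mirroring MessageHandler.
interface SketchHandler {
    void saveMessages();
    void failingCall();
}

class TxDemarcationSketch {

    // Wraps the target so every call is bracketed by "begin" and either
    // "commit" (normal return) or "rollback" (exception), recorded in events.
    static SketchHandler wrap(final SketchHandler target, final List<String> events) {
        return (SketchHandler) Proxy.newProxyInstance(
            SketchHandler.class.getClassLoader(),
            new Class<?>[] { SketchHandler.class },
            (proxy, method, args) -> {
                events.add("begin " + method.getName());
                try {
                    Object result = method.invoke(target, args);
                    events.add("commit " + method.getName());
                    return result;
                } catch (InvocationTargetException e) {
                    events.add("rollback " + method.getName());
                    throw e.getCause();
                }
            });
    }

    // Runs one successful and one failing call and returns the recorded events.
    static List<String> runDemo() {
        final List<String> events = new ArrayList<>();
        SketchHandler handler = wrap(new SketchHandler() {
            public void saveMessages() { events.add("work"); }
            public void failingCall() {
                throw new IllegalStateException("simulated failure");
            }
        }, events);
        handler.saveMessages();
        try {
            handler.failingCall();
        } catch (IllegalStateException expected) {
            // The proxy rethrows the original exception after recording the rollback.
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : runDemo()) {
            System.out.println(event);
        }
    }
}
```

The business code inside saveMessages() contains no demarcation logic at all, which is the readability benefit the text refers to.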
NOTE
Packaging of persistence units —We didn’t talk much about the packaging of persistence units—you didn’t need to package the “Hello World” example for any of the deployments. However, if you want to use features such as hot redeployment on a full application server, you need to package your application correctly. This includes the usual combination of JARs, WARs, EJB-JARs, and EARs. Deployment and packaging is often also vendor-specific, so you should consult the documentation of your application server for more information. JPA persistence units can be scoped to JARs, WARs, and EJB-JARs, which means that one or several of these archives contains all the annotated classes and a META-INF/persistence.xml configuration file with all settings for this particular unit. You can wrap one or several JARs, WARs, and EJB-JARs in a single enterprise application archive, an EAR. Your application server should correctly detect all persistence units and create the necessary factories automatically. With a unit name attribute on the @PersistenceContext annotation, you instruct the container to inject an EntityManager from a particular unit.
Full portability of an application isn’t often a primary reason to use JPA or EJB 3.0. After all, you made a decision to use Hibernate as your JPA persistence provider. Let’s look at how you can fall back and use a Hibernate native feature from time to time.
2.2.4
Switching to Hibernate interfaces
You decided to use Hibernate as a JPA persistence provider for several reasons: First, Hibernate is a good JPA implementation that provides many options that don’t affect your code. For example, you can enable the Hibernate second-level data cache in your JPA configuration, and transparently improve the performance and scalability of your application without touching any code. Second, you can use native Hibernate mappings or APIs when needed. We discuss the mixing of mappings (especially annotations) in chapter 3, section 3.3, “Object/relational mapping metadata,” but here we want to show how you can use a Hibernate API in your JPA application, when needed. Obviously, importing a Hibernate API into your code makes porting the code to a different JPA provider more difficult. Hence, it becomes critically important to isolate these parts of your code properly, or at least to document why and when you used a native Hibernate feature. You can fall back to Hibernate APIs from their equivalent JPA interfaces and get, for example, a Configuration, a SessionFactory, and even a Session whenever needed. For example, instead of creating an EntityManagerFactory with the Persistence static class, you can use a Hibernate Ejb3Configuration:
Ejb3Configuration cfg = new Ejb3Configuration();
EntityManagerFactory emf =
    cfg.configure("/custom/hibernate.cfg.xml")
       .setProperty("hibernate.show_sql", "false")
       .setInterceptor( new MyInterceptor() )
       .addAnnotatedClass( hello.Message.class )
       .addResource( "/Foo.hbm.xml")
       .buildEntityManagerFactory();

AnnotationConfiguration hibCfg = cfg.getHibernateConfiguration();
The Ejb3Configuration is a new interface that duplicates the regular Hibernate Configuration instead of extending it (this is an implementation detail). This means you can get a plain AnnotationConfiguration object from an Ejb3Configuration, for example, and pass it to a SchemaExport instance programmatically. The SessionFactory interface is useful if you need programmatic control over the second-level cache regions. You can get a SessionFactory by casting the EntityManagerFactory first:
HibernateEntityManagerFactory hibEMF =
    (HibernateEntityManagerFactory) emf;
SessionFactory sf = hibEMF.getSessionFactory();
The same technique can be applied to get a Session from an EntityManager:
HibernateEntityManager hibEM =
    (HibernateEntityManager) em;
Session session = hibEM.getSession();
This isn’t the only way to get a native API from the standardized EntityManager. The JPA specification supports a getDelegate() method that returns the underlying implementation:
Session session = (Session) entityManager.getDelegate();
Or you can get a Session injected into an EJB component (although this only works in the JBoss Application Server):
@Stateless
public class MessageHandlerBean implements MessageHandler {

    @PersistenceContext
    Session session;

    ...
}
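One way to follow the earlier advice about isolating provider-specific code is to confine the casts shown above to a single helper. The sketch below illustrates only the pattern: PortableManager and NativeManager are hypothetical stand-ins for a standard interface and a provider-specific extension, not Hibernate or JPA types, and a real application would substitute the actual interfaces.

```java
// Hypothetical stand-ins: PortableManager plays the role of a standard
// interface, NativeManager the role of a provider-specific extension.
interface PortableManager {}

interface NativeManager extends PortableManager {
    String nativeFeature();
}

final class NativeAccessSketch {

    private NativeAccessSketch() {}

    // The single place in the application that knows about the provider type.
    // It fails with a descriptive message instead of a bare ClassCastException.
    static NativeManager unwrap(PortableManager manager) {
        if (manager instanceof NativeManager) {
            return (NativeManager) manager;
        }
        throw new IllegalStateException(
            "Provider-specific API not available for " + manager.getClass().getName());
    }

    public static void main(String[] args) {
        PortableManager manager = new NativeManager() {
            public String nativeFeature() { return "native feature used"; }
        };
        System.out.println(unwrap(manager).nativeFeature());
    }
}
```

With such a helper, a later move to a different provider touches one class rather than every call site that needed a native feature.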
In rare cases, you can fall back to plain JDBC interfaces from the Hibernate Session:
Connection jdbcConnection = session.connection();
This last option comes with some caveats: You aren’t allowed to close the JDBC Connection you get from Hibernate; this happens automatically. The exception to this rule is that in an environment that relies on aggressive connection releases, which means in a JTA or CMT environment, you have to close the returned connection in application code. A better and safer way to access a JDBC connection directly is through resource injection in a Java EE 5.0 environment. Annotate a field or setter method in an EJB, an EJB listener, a servlet, a servlet filter, or even a JavaServer Faces backing bean, like this:
@Resource(mappedName="java:/HelloWorldDS")
DataSource ds;
So far, we’ve assumed that you work on a new Hibernate or JPA project that involves no legacy application code or existing database schema. We now switch perspectives and consider a development process that is bottom-up. In such a scenario, you probably want to automatically reverse-engineer artifacts from an existing database schema.
2.3
Reverse engineering a legacy database
Your first step when mapping a legacy database likely involves an automatic reverse-engineering procedure. After all, an entity schema already exists in your database system. To make this easier, Hibernate has a set of tools that can read a schema and produce various artifacts from this metadata, including XML mapping files and Java source code. All of this is template-based, so many customizations are possible. You can control the reverse-engineering process with tools and tasks in your Ant build. The HibernateToolTask you used earlier to export SQL DDL from Hibernate mapping metadata has many more options, most of which are related to reverse engineering, as to how XML mapping files, Java code, or even whole application skeletons can be generated automatically from an existing database schema. We’ll first show you how to write an Ant target that can load an existing database into a Hibernate metadata model. Next, you’ll apply various exporters and produce XML files, Java code, and other useful artifacts from the database tables and columns.
2.3.1
Creating a database configuration
Let’s assume that you have a new WORKDIR with nothing but the lib directory (and its usual contents) and an empty src directory. To generate mappings and code from an existing database, you first need to create a configuration file that contains your database connection settings:
hibernate.dialect = org.hibernate.dialect.HSQLDialect
hibernate.connection.driver_class = org.hsqldb.jdbcDriver
hibernate.connection.url = jdbc:hsqldb:hsql://localhost
hibernate.connection.username = sa
Store this file directly in WORKDIR, and name it helloworld.db.properties. The four lines shown here are the minimum that is required to connect to the database and read the metadata of all tables and columns. You could have created a Hibernate XML configuration file instead of hibernate.properties, but there is no reason to make this more complex than necessary. Write the Ant target next. In a build.xml file in your project, add the following code:
The HibernateToolTask definition for Ant is the same as before. We assume that you’ll reuse most of the build file introduced in previous sections, and that references such as project.classpath are the same. The task is set with WORKDIR/src as the default destination directory for all generated artifacts. A <jdbcconfiguration> is a Hibernate tool configuration that can connect to a database via JDBC and read the JDBC metadata from the database catalog. You usually configure it with two options: database connection settings (the properties file) and an optional reverse-engineering customization file. The metadata produced by the tool configuration is then fed to exporters. The example Ant target names two such exporters: the hbm2hbmxml exporter, as you can guess from its name, takes Hibernate metadata (hbm) from a configuration, and generates Hibernate XML mapping files; the second exporter can prepare a hibernate.cfg.xml file that lists all the generated XML mapping files. Before we talk about these and various other exporters, let’s spend a minute on the reverse-engineering customization file and what you can do with it.
2.3.2
Customizing reverse engineering
JDBC metadata (that is, the information you can read from a database about itself via JDBC) often isn’t sufficient to create a perfect XML mapping file, let alone Java application code. The opposite may also be true: Your database may contain information that you want to ignore (such as particular tables or columns) or that you wish to transform with nondefault strategies. You can customize the reverse-engineering procedure with a reverse-engineering configuration file, which uses an XML syntax. Let’s assume that you’re reverse-engineering the “Hello World” database you created earlier in this chapter, with its single MESSAGES table and only a few columns. With a helloworld.reveng.xml file, as shown in listing 2.17, you can customize this reverse engineering.
Listing 2.17 Configuration for customized reverse engineering
This XML file has its own DTD for validation and autocompletion.
A table filter can exclude tables by name with a regular expression. However, in this example, you define a default package for all classes produced for the tables matching the regular expression.
You can customize individual tables by name. The schema name is usually optional, but HSQLDB assigns the PUBLIC schema to all tables by default, so this setting is needed to identify the table when the JDBC metadata is retrieved. You can also set a custom class name for the generated entity here.
The primary key column generates a property named id; the default would be messageId. You also explicitly declare which Hibernate identifier generator should be used.
An individual column can be excluded or, in this case, the name of the generated property can be specified; the default would be messageText.
If the foreign key constraint FK_NEXT_MESSAGE is retrieved from JDBC metadata, a many-to-one association is created by default to the target entity of that class. By matching the foreign key constraint by name, you can specify whether an inverse collection (one-to-many) should also be generated (the example excludes this) and what the name of the many-to-one property should be.
If you now run the Ant target with this customization, it generates a Message.hbm.xml file in the hello package in your source directory. (You need to copy the Freemarker and jTidy JAR files into your library directory first.) The customizations you made result in the same Hibernate mapping file you wrote earlier by hand, shown in listing 2.2. In addition to the XML mapping file, the Ant target also generates a Hibernate XML configuration file in the source directory:
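<hibernate-configuration>
    <session-factory>
        <property name="hibernate.connection.driver_class">org.hsqldb.jdbcDriver</property>
        <property name="hibernate.connection.url">jdbc:hsqldb:hsql://localhost</property>
        <property name="hibernate.connection.username">sa</property>
        <property name="hibernate.dialect">org.hibernate.dialect.HSQLDialect</property>
        <mapping resource="hello/Message.hbm.xml" />
    </session-factory>
</hibernate-configuration>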
The exporter writes all the database connection settings you used for reverse engineering into this file, assuming that this is the database you want to connect to when you run the application. It also adds all generated XML mapping files to the configuration. What is your next step? You can start writing the source code for the Message Java class. Or you can let the Hibernate Tools generate the classes of the domain model for you.
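The default names mentioned above (messageId, messageText) and the regular-expression table filter can be illustrated with a small sketch. It is an illustration under stated assumptions, not the Hibernate Tools implementation: the column names MESSAGE_ID and MESSAGE_TEXT are assumed, the class RevengNamingSketch and its methods are hypothetical, and the real tools apply a configurable naming strategy with more rules than this.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class RevengNamingSketch {

    // Returns true when the whole table name matches the filter expression.
    static boolean matchesFilter(String regex, String tableName) {
        return Pattern.matches(regex, tableName);
    }

    // Converts an upper-case, underscore-separated column name into a
    // camel-case property name: each underscore is dropped and the
    // character that follows it is upper-cased.
    static String toPropertyName(String columnName) {
        StringBuilder name = new StringBuilder();
        boolean upperNext = false;
        for (char c : columnName.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c == '_') {
                upperNext = true;
            } else if (upperNext) {
                name.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                name.append(c);
            }
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchesFilter(".*", "MESSAGES"));
        System.out.println(toPropertyName("MESSAGE_ID"));
        System.out.println(toPropertyName("MESSAGE_TEXT"));
    }
}
```

A customization file exists precisely because such mechanical defaults are rarely the names you would choose by hand, such as id and text in the handwritten mapping.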
2.3.3
Generating Java source code
Let’s assume you have an existing Hibernate XML mapping file for the Message class, and you’d like to generate the source for the class. As discussed in chapter 3, a plain Java entity class ideally implements Serializable, has a no-arguments constructor, has getters and setters for all properties, and has an encapsulated implementation. Source code for entity classes can be generated with the Hibernate Tools and the hbm2java exporter in your Ant build. The source artifact can be anything that can be read into a Hibernate metadata model—Hibernate XML mapping files are best if you want to customize the Java code generation. Add the following target to your Ant build:
The <configuration> reads all Hibernate XML mapping files, and the <hbm2java> exporter produces Java source code with the default strategy.
Customizing entity class generation
By default, hbm2java generates a simple entity class for each mapped entity. The class implements the Serializable marker interface, and it has accessor methods for all properties and the required constructor. All attributes of the class have private visibility for fields, although you can change that behavior with the <meta> element and attributes in the XML mapping files. The first change to the default reverse engineering behavior you make is to restrict the visibility scope for the Message’s attributes. By default, all accessor methods are generated with public visibility. Let’s say that Message objects are immutable; you wouldn’t expose the setter methods on the public interface, but only the getter methods. Instead of enhancing the mapping of each property with a <meta> element, you can declare a meta-attribute at the class level, thus applying the setting to all properties in that class:
<class name="Message" table="MESSAGES">
    <meta attribute="scope-set">private</meta>
    ...
</class>
The scope-set attribute defines the visibility of property setter methods. The hbm2java exporter also accepts meta-attributes on the next higher level, in the root <hibernate-mapping> element, which are then applied to all classes mapped in the XML file. You can also add fine-grained meta-attributes to single property, collection, or component mappings. One (albeit small) improvement of the generated entity class is the inclusion of the text of the Message in the output of the generated toString() method. The text is a good visual control element in the log output of the application. You can change the mapping of Message to include it in the generated code:
<property name="text">
    <meta attribute="use-in-tostring">true</meta>
</property>
The generated code of the toString() method in Message.java looks like this:
public String toString() {
    StringBuffer buffer = new StringBuffer();
    buffer.append(getClass().getName())
          .append("@")
          .append( Integer.toHexString(hashCode()) )
          .append(" [")
          .append("text").append("='").append(getText()).append("' ")
          .append("]");
    return buffer.toString();
}
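Putting the scope-set and use-in-tostring meta-attributes together, a generated entity class might look roughly like the following. This is a hand-written sketch, not actual hbm2java output: the class name GeneratedMessageSketch is hypothetical, and the exact fields, constructors, and formatting produced by the exporter depend on its templates.

```java
import java.io.Serializable;

// Hypothetical sketch of a generated entity; not actual hbm2java output.
class GeneratedMessageSketch implements Serializable {

    private static final long serialVersionUID = 1L;

    private Long id;
    private String text;

    // No-argument constructor, as expected of an entity class.
    public GeneratedMessageSketch() {}

    public GeneratedMessageSketch(String text) {
        this.text = text;
    }

    public Long getId() { return id; }

    // scope-set is "private": setters are not part of the public interface.
    private void setId(Long id) { this.id = id; }

    public String getText() { return text; }

    private void setText(String text) { this.text = text; }

    // use-in-tostring is "true" for text: the property appears in toString().
    @Override
    public String toString() {
        StringBuilder buffer = new StringBuilder();
        buffer.append(getClass().getName())
              .append("@")
              .append(Integer.toHexString(hashCode()))
              .append(" [")
              .append("text").append("='").append(getText()).append("' ")
              .append("]");
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(new GeneratedMessageSketch("Hello World"));
    }
}
```

The private setters remain available to code inside the class and to reflection-based access, while callers of the public interface can only read the state.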
Meta-attributes can be inherited; that is, if you declare a use-in-tostring at the level of a <class> element, all properties of that class are included in the toString() method. This inheritance mechanism works for all hbm2java meta-attributes, but you can turn it off selectively:
<meta attribute="scope-class" inherit="false">public abstract</meta>
Setting inherit to false in the scope-class meta-attribute creates only the parent class of this <meta> element as public abstract, but not any of the (possibly) nested subclasses. The hbm2java exporter supports, at the time of writing, 17 meta-attributes for fine-tuning code generation. Most are related to visibility, interface implementation, class extension, and predefined Javadoc comments. Refer to the Hibernate Tools documentation for a complete list. If you use JDK 5.0, you can switch to automatically generated static imports and generics with the jdk5="true" setting on the <hbm2java> task. Or, you can produce EJB 3.0 entity classes with annotations.
Generating Java Persistence entity classes
Normally, you use either Hibernate XML mapping files or JPA annotations in your entity class source code to define your mapping metadata, so generating Java Persistence entity classes with annotations from XML mapping files doesn’t seem reasonable. However, you can create entity class source code with annotations directly from JDBC metadata, and skip the XML mapping step. Look at the following Ant target:
This target generates entity class source code with mapping annotations and a hibernate.cfg.xml file that lists these mapped classes. You can edit the Java source directly to customize the mapping, if the customization in helloworld.reveng.xml is too limited. Also note that all exporters rely on templates written in the FreeMarker template language. You can customize the templates in whatever way you like, or even write your own. Even programmatic customization of code generation is possible. The Hibernate Tools reference documentation shows you how these options are used. Other exporters and configurations are available with the Hibernate Tools:
■ An <annotationconfiguration> replaces the regular <configuration> if you want to read mapping metadata from annotated Java classes, instead of XML mapping files. Its only argument is the location and name of a hibernate.cfg.xml file that contains a list of annotated classes. Use this approach to export a database schema from annotated classes.
■ An <ejb3configuration> is equivalent to an <annotationconfiguration>, except that it can scan for annotated Java classes automatically on the classpath; it doesn’t need a hibernate.cfg.xml file.
■ The <hbm2dao> exporter can create additional Java source for a persistence layer, based on the data access object pattern. At the time of writing, the templates for this exporter are old and need updating. We expect that the finalized templates will be similar to the DAO code shown in chapter 16, section 16.2, “Creating a persistence layer.”
■ The <hbm2doc> exporter generates HTML files that document the tables and Java entities.
■ The <hbmtemplate> exporter can be parameterized with a set of custom FreeMarker templates, and you can generate anything you want with this approach. Templates that produce a complete runnable skeleton application with the JBoss Seam framework are bundled in the Hibernate Tools.
You can get creative with the import and export functionality of the tools. For example, you can read annotated Java classes with and export them with . This allows you to develop with JDK 5.0 and the more convenient annotations but deploy Hibernate XML mapping files in production (on JDK 1.4). Let’s finish this chapter with some more advanced configuration options and integrate Hibernate with Java EE services.
2.4
Integration with Java EE services
We assume that you’ve already tried the “Hello World” example shown earlier in this chapter and that you’re familiar with basic Hibernate configuration and how Hibernate can be integrated with a plain Java application. We’ll now discuss more advanced native Hibernate configuration options and how a regular Hibernate application can utilize the Java EE services provided by a Java EE application server. If you created your first JPA project with Hibernate Annotations and Hibernate EntityManager, the following configuration advice isn’t really relevant for you— you’re already deep inside Java EE land if you’re using JPA, and no extra integration steps are required. Hence, you can skip this section if you use Hibernate EntityManager. Java EE application servers such as JBoss AS, BEA WebLogic, and IBM WebSphere implement the standard (Java EE-specific) managed environment for Java. The three most interesting Java EE services Hibernate can be integrated with are JTA, JNDI, and JMX. JTA allows Hibernate to participate in transactions on managed resources. Hibernate can look up managed resources (database connections) via JNDI and also bind itself as a service to JNDI. Finally, Hibernate can be deployed via JMX and then be managed as a service by the JMX container and monitored at runtime with standard JMX clients. Let’s look at each service and how you can integrate Hibernate with it.
2.4.1
Integration with JTA
The Java Transaction API (JTA) is the standardized service interface for transaction control in Java enterprise applications. It exposes several interfaces, such as the UserTransaction API for transaction demarcation and the TransactionManager API for participation in the transaction lifecycle. The transaction manager can coordinate a transaction that spans several resources—imagine working in two Hibernate Sessions on two databases in a single transaction. A JTA transaction service is provided by all Java EE application servers. However, many Java EE services are usable stand-alone, and you can deploy a JTA provider along with your application, such as JBoss Transactions or ObjectWeb JOTM. We won’t have much to say about this part of your configuration but focus on the integration of Hibernate with a JTA service, which is the same in full application servers or with stand-alone JTA providers. Look at figure 2.6. You use the Hibernate Session interface to access your database(s), and it’s Hibernate’s responsibility to integrate with the Java EE services of the managed environment.
Figure 2.6
Hibernate in an environment with managed resources
In such a managed environment, Hibernate no longer creates and maintains a JDBC connection pool—Hibernate obtains database connections by looking up a Datasource object in the JNDI registry. Hence, your Hibernate configuration needs a reference to the JNDI name where managed connections can be obtained.
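<hibernate-configuration>
<session-factory>
    <property name="hibernate.connection.datasource">
        java:/MyDatasource
    </property>
    <property name="hibernate.dialect">
        org.hibernate.dialect.HSQLDialect
    </property>
    ...
</session-factory>
</hibernate-configuration>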
With this configuration file, Hibernate looks up database connections in JNDI using the name java:/MyDatasource. When you configure your application server and deploy your application, or when you configure your stand-alone JTA provider, this is the name to which you should bind the managed datasource. Note that a dialect setting is still required for Hibernate to produce the correct SQL.
NOTE
Hibernate with Tomcat—Tomcat isn’t a Java EE application server; it’s just a servlet container, albeit a servlet container with some features usually found only in application servers. One of these features may be used with Hibernate: the Tomcat connection pool. Tomcat uses the DBCP connection pool internally but exposes it as a JNDI datasource, just like a real application server. To configure the Tomcat datasource, you need to edit server.xml, according to instructions in the Tomcat JNDI/JDBC documentation. Hibernate can be configured to use this datasource by setting hibernate.connection.datasource. Keep in mind that Tomcat doesn’t ship with a transaction manager, so you still have plain JDBC transaction semantics, which Hibernate can hide with its optional Transaction API. Alternatively, you can deploy a JTA-compatible standalone transaction manager along with your web application, which you should consider to get the standardized UserTransaction API. On the other hand, a regular application server (especially if it’s modular like JBoss AS) may be easier to configure than Tomcat plus DBCP plus JTA, and it provides better services.
To fully integrate Hibernate with JTA, you need to tell Hibernate a bit more about your transaction manager. Hibernate has to hook into the transaction lifecycle, for example, to manage its caches. First, you need to tell Hibernate what transaction manager you’re using:
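<hibernate-configuration>
<session-factory>
    <property name="hibernate.connection.datasource">
        java:/MyDatasource
    </property>
    <property name="hibernate.dialect">
        org.hibernate.dialect.HSQLDialect
    </property>
    <property name="hibernate.transaction.manager_lookup_class">
        org.hibernate.transaction.JBossTransactionManagerLookup
    </property>
    <property name="hibernate.transaction.factory_class">
        org.hibernate.transaction.JTATransactionFactory
    </property>
    ...
</session-factory>
</hibernate-configuration>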
You need to pick the appropriate lookup class for your application server, as you did in the preceding code—Hibernate comes bundled with classes for the most popular JTA providers and application servers. Finally, you tell Hibernate that you want to use the JTA transaction interfaces in the application to set transaction boundaries. The JTATransactionFactory does several things:
■ It enables correct Session scoping and propagation for JTA if you decide to use the SessionFactory.getCurrentSession() method instead of opening and closing every Session manually. We discuss this feature in more detail in chapter 11, section 11.1, “Propagating the Hibernate session.”
■ It tells Hibernate that you’re planning to call the JTA UserTransaction interface in your application to start, commit, or roll back system transactions.
■ It also switches the Hibernate Transaction API to JTA, in case you don’t want to work with the standardized UserTransaction. If you now begin a transaction with the Hibernate API, it checks whether an ongoing JTA transaction is in progress and, if possible, joins this transaction. If no JTA transaction is in progress, a new transaction is started. If you commit or roll back with the Hibernate API, it either ignores the call (if Hibernate joined an existing transaction) or sets the system transaction to commit or roll back. We don’t recommend using the Hibernate Transaction API if you deploy in an environment that supports JTA. However, this setting keeps existing code portable between managed and nonmanaged environments, albeit with possibly different transactional behavior.
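The join-or-begin behavior described in the last item can be pictured with a small toy model. This is only a sketch of the rule as stated in the text: the names JoinOrBeginSketch, SystemTransaction, and TransactionHandle are hypothetical and are not Hibernate or JTA classes, and real transaction managers do considerably more.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the join-or-begin rule; none of these types are Hibernate or JTA classes. */
class JoinOrBeginSketch {

    /** Stand-in for the container's system transaction. */
    static class SystemTransaction {
        boolean active;
        final List<String> log = new ArrayList<>();

        void begin()  { active = true;  log.add("begin"); }
        void commit() { active = false; log.add("commit"); }
    }

    /** Stand-in for a transaction handle layered over the system transaction. */
    static class TransactionHandle {
        private final SystemTransaction system;
        private final boolean initiator; // true only if this handle started the transaction

        TransactionHandle(SystemTransaction system) {
            this.system = system;
            if (system.active) {
                initiator = false;   // join the transaction that is already in progress
            } else {
                system.begin();      // nothing in progress: start a new one
                initiator = true;
            }
        }

        void commit() {
            if (initiator) {
                system.commit();     // this handle owns the boundary, so the commit is real
            }
            // Otherwise the call is ignored; whoever started the transaction ends it.
        }
    }

    public static void main(String[] args) {
        SystemTransaction tx = new SystemTransaction();
        tx.begin();                              // started elsewhere, for example by the caller
        new TransactionHandle(tx).commit();      // joined, so this commit does nothing
        System.out.println(tx.active + " " + tx.log);
    }
}
```

The point of the sketch is the no-op: code that calls commit on a joined transaction leaves the system transaction open, which is the changed semantics the text warns about when the same code moves between environments.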
There are other built-in TransactionFactory options, and you can write your own by implementing this interface. The JDBCTransactionFactory is the default in a nonmanaged environment, and you have used it throughout this chapter in
the simple “Hello World” example with no JTA. The CMTTransactionFactory should be enabled if you’re working with JTA and EJBs, and if you plan to set transaction boundaries declaratively on your managed EJB components—in other words, if you deploy your EJB application on a Java EE application server but don’t set transaction boundaries programmatically with the UserTransaction interface in application code. Our recommended configuration options, ordered by preference, are as follows:
■ If your application has to run in managed and nonmanaged environments, you should move the responsibility for transaction integration and resource management to the deployer. Call the JTA UserTransaction API in your application code, and let the deployer of the application configure the application server or a stand-alone JTA provider accordingly. Enable JTATransactionFactory in your Hibernate configuration to integrate with the JTA service, and set the right lookup class.
■ Consider setting transaction boundaries declaratively, with EJB components. Your data access code then isn’t bound to any transaction API, and the CMTTransactionFactory integrates and handles the Hibernate Session for you behind the scenes. This is the easiest solution—of course, the deployer now has the responsibility to provide an environment that supports JTA and EJB components.
■ Write your code with the Hibernate Transaction API and let Hibernate switch between the different deployment environments by setting either JDBCTransactionFactory or JTATransactionFactory. Be aware that transaction semantics may change, and the start or commit of a transaction may result in a no-op you may not expect. This is always the last choice when portability of transaction demarcation is needed.
FAQ
How can I use several databases with Hibernate? If you want to work with several databases, you create several configuration files. Each database is assigned its own SessionFactory, and you build several SessionFactory instances from distinct Configuration objects. Each Session that is opened, from any SessionFactory, looks up a managed datasource in JNDI. It’s now the responsibility of the transaction and resource manager to coordinate these resources—Hibernate only executes SQL statements on these database connections. Transaction boundaries are either set programmatically with JTA or handled by the container with EJBs and a declarative assembly.
Hibernate can not only look up managed resources in JNDI, it can also bind itself to JNDI. We’ll look at that next.
2.4.2
JNDI-bound SessionFactory
We already touched on a question that every new Hibernate user has to deal with: How should a SessionFactory be stored, and how should it be accessed in application code? Earlier in this chapter, we addressed this problem by writing a HibernateUtil class that held a SessionFactory in a static field and provided the static getSessionFactory() method. However, if you deploy your application in an environment that supports JNDI, Hibernate can bind a SessionFactory to JNDI, and you can look it up there when needed.
NOTE
The Java Naming and Directory Interface API (JNDI) allows objects to be stored to and retrieved from a hierarchical structure (directory tree). JNDI implements the Registry pattern. Infrastructural objects (transaction contexts, datasources, and so on), configuration settings (environment settings, user registries, and so on) and even application objects (EJB references, object factories, and so on) can all be bound to JNDI.
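To make the Registry pattern mentioned in this note concrete, here is a minimal in-memory sketch. It is not a JNDI implementation and supports no hierarchy or remote access; the class RegistrySketch and its bind and lookup methods are hypothetical names chosen for illustration, and a String stands in for the object you would really bind.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal in-memory registry; an illustration of the pattern, not a JNDI implementation. */
class RegistrySketch {

    private final Map<String, Object> bindings = new ConcurrentHashMap<>();

    /** Binds an object under a name; fails if the name is already taken. */
    void bind(String name, Object value) {
        if (bindings.putIfAbsent(name, value) != null) {
            throw new IllegalStateException("Name already bound: " + name);
        }
    }

    /** Looks up a previously bound object and casts it to the expected type. */
    <T> T lookup(String name, Class<T> type) {
        Object value = bindings.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Nothing bound under: " + name);
        }
        return type.cast(value);
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.bind("java:/hibernate/MySessionFactory", "stand-in for a SessionFactory");
        System.out.println(registry.lookup("java:/hibernate/MySessionFactory", String.class));
    }
}
```

The producer of an object and its consumers share nothing but a name, which is what lets infrastructure bind a resource at startup and application code find it later.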
The Hibernate SessionFactory automatically binds itself to JNDI if the hibernate.session_factory_name property is set to the name of the JNDI node. If your runtime environment doesn’t provide a default JNDI context (or if the default JNDI implementation doesn’t support instances of Referenceable), you need to specify a JNDI initial context using the hibernate.jndi.url and hibernate.jndi.class properties. Here is an example Hibernate configuration that binds the SessionFactory to the name java:/hibernate/MySessionFactory using Sun’s (free) file-system-based JNDI implementation, fscontext.jar:
hibernate.connection.datasource = java:/MyDatasource
hibernate.transaction.factory_class = \
    org.hibernate.transaction.JTATransactionFactory
hibernate.transaction.manager_lookup_class = \
    org.hibernate.transaction.JBossTransactionManagerLookup
hibernate.dialect = org.hibernate.dialect.PostgreSQLDialect
hibernate.session_factory_name = java:/hibernate/MySessionFactory
hibernate.jndi.class = com.sun.jndi.fscontext.RefFSContextFactory
hibernate.jndi.url = file:/auction/jndi
You can, of course, also use the XML-based configuration for this. This example isn’t realistic, because most application servers that provide a connection pool through JNDI also have a JNDI implementation with a writable default context.
JBoss AS certainly has, so you can skip the last two properties and just specify a name for the SessionFactory.
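The properties listing above uses the standard java.util.Properties syntax, in which a trailing backslash continues a value on the next line. If you want to check how such a file is read, a small sketch like the following may help. It uses only the JDK; the class name PropertiesContinuationSketch and the parse helper are hypothetical, and no Hibernate code is involved.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesContinuationSketch {

    /** Parses text written in java.util.Properties syntax. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return props;
    }

    public static void main(String[] args) {
        // Two entries from the listing; the first value continues on a second line.
        String text =
              "hibernate.transaction.factory_class = \\\n"
            + "    org.hibernate.transaction.JTATransactionFactory\n"
            + "hibernate.session_factory_name = java:/hibernate/MySessionFactory\n";
        Properties props = parse(text);
        System.out.println(props.getProperty("hibernate.transaction.factory_class"));
        System.out.println(props.getProperty("hibernate.session_factory_name"));
    }
}
```

The leading whitespace on a continuation line is dropped when the file is loaded, so indenting the long class names is purely cosmetic.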
NOTE
JNDI with Tomcat—Tomcat comes bundled with a read-only JNDI context, which isn’t writable from application-level code after the startup of the servlet container. Hibernate can’t bind to this context: You have to either use a full context implementation (like the Sun FS context) or disable JNDI binding of the SessionFactory by omitting the session_factory_name property in the configuration.
The SessionFactory is bound to JNDI when you build it, which means when Configuration.buildSessionFactory() is called. To keep your application code portable, you may want to implement this build and the lookup in HibernateUtil, and continue using that helper class in your data access code, as shown in listing 2.18.
Listing 2.18 HibernateUtil for JNDI lookup of SessionFactory
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class HibernateUtil {

    private static Context jndiContext;

    static {
        try {
            // Build it and bind it to JNDI
            new Configuration().buildSessionFactory();

            // Get a handle to the registry (reads jndi.properties)
            jndiContext = new InitialContext();

        } catch (Throwable ex) {
            throw new ExceptionInInitializerError(ex);
        }
    }

    public static SessionFactory getSessionFactory(String sfName) {
        SessionFactory sf;
        try {
            sf = (SessionFactory) jndiContext.lookup(sfName);
        } catch (NamingException ex) {
            throw new RuntimeException(ex);
        }
        return sf;
    }
}
Alternatively, you can look up the SessionFactory directly in application code with a JNDI call. However, you still need at least the new Configuration().buildSessionFactory() line of startup code somewhere in your application. One way to remove this last line of Hibernate startup code, and to completely eliminate the HibernateUtil class, is to deploy Hibernate as a JMX service (or by using JPA and Java EE).
2.4.3
JMX service deployment
The Java world is full of specifications, standards, and implementations of these. A relatively new, but important, standard is in its first version: the Java Management Extensions (JMX). JMX is about the management of systems components or, better, of system services. Where does Hibernate fit into this new picture? Hibernate, when deployed in an application server, makes use of other services, like managed transactions and pooled datasources. Also, with Hibernate JMX integration, Hibernate can be a managed JMX service, depended on and used by others. The JMX specification defines the following components:
■ The JMX MBean—A reusable component (usually infrastructural) that exposes an interface for management (administration)
■ The JMX container—Mediates generic access (local or remote) to the MBean
■ The JMX client—May be used to administer any MBean via the JMX container
An application server with support for JMX (such as JBoss AS) acts as a JMX container and allows an MBean to be configured and initialized as part of the application server startup process. Your Hibernate service may be packaged and deployed as a JMX MBean; the bundled interface for this is org.hibernate.jmx.HibernateService. You can start, stop, and monitor the Hibernate core through this interface with any standard JMX client. A second MBean interface that can be deployed optionally is org.hibernate.jmx.StatisticsService, which lets you enable and monitor Hibernate’s runtime behavior with a JMX client. How JMX services and MBeans are deployed is vendor-specific. For example, on JBoss Application Server, you only have to add a jboss-service.xml file to your application’s EAR to deploy Hibernate as a managed JMX service. Rather than explain every option here, we refer you to the reference documentation for JBoss Application Server. It contains a section that shows Hibernate integration and deployment step by step (http://docs.jboss.org/jbossas). Configuration and deployment on other application servers that support JMX should be similar, and you can adapt and port the JBoss configuration files.
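The three JMX roles described in this section can be tried out with nothing but the JDK. The following sketch registers a trivial service with the JVM's platform MBean server, which here plays the part of the container, and then drives it through the server the way a client would. It is not the Hibernate MBean: JmxSketch, ServiceControl, DemoService, and the object name are hypothetical, and only the javax.management and java.lang.management calls are real JDK APIs.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class JmxSketch {

    /** Management interface: what a JMX client is allowed to see and call. */
    public interface ServiceControl {
        boolean isStarted();   // exposed as the read-only attribute "Started"
        void start();          // exposed as an operation
        void stop();           // exposed as an operation
    }

    /** The managed component itself; a stand-in, not Hibernate code. */
    public static class DemoService implements ServiceControl {
        private volatile boolean started;

        public boolean isStarted() { return started; }
        public void start() { started = true; }
        public void stop() { started = false; }
    }

    /** Registers the service, starts it through the MBean server, and reports its state. */
    static boolean startThroughJmx(String name) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(
                new StandardMBean(new DemoService(), ServiceControl.class), objectName);
            try {
                // A client never touches DemoService directly; it goes through the server.
                server.invoke(objectName, "start", new Object[0], new String[0]);
                return (Boolean) server.getAttribute(objectName, "Started");
            } finally {
                server.unregisterMBean(objectName);
            }
        } catch (JMException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(startThroughJmx("example.sketch:type=DemoService"));
    }
}
```

An application server adds what this sketch leaves out: declarative deployment of the MBean at startup and remote access for management consoles.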
2.5
Summary
In this chapter, you have completed a first Hibernate project. We looked at how Hibernate XML mapping files are written and what APIs you can call in Hibernate to interact with the database. We then introduced Java Persistence and EJB 3.0 and explained how it can simplify even the most basic Hibernate application with automatic metadata scanning, standardized configuration and packaging, and dependency injection in managed EJB components. If you have to get started with a legacy database, you can use the Hibernate toolset to reverse engineer XML mapping files from an existing schema. Or, if you work with JDK 5.0 and/or EJB 3.0, you can generate Java application code directly from an SQL database. Finally, we looked at more advanced Hibernate integration and configuration options in a Java EE environment—integration that is already done for you if you rely on JPA or EJB 3.0. A high-level overview and comparison between Hibernate functionality and Java Persistence is shown in table 2.1. (You can find a similar comparison table at the end of each chapter.)
Table 2.1 Hibernate and JPA comparison

Hibernate Core | Java Persistence and EJB 3.0
Integrates with everything, everywhere. Flexible, but sometimes configuration is complex. | Works in Java EE and Java SE. Simple and standardized configuration; no extra integration or special configuration is necessary in Java EE environments.
Configuration requires a list of XML mapping files or annotated classes. | JPA provider scans for XML mapping files and annotated classes automatically.
Proprietary but powerful. Continually improved native programming interfaces and query language. | Standardized and stable interfaces, with a sufficient subset of Hibernate functionality. Easy fallback to Hibernate APIs is possible.
In the next chapter, we introduce a more complex example application that we’ll work with throughout the rest of the book. You’ll see how to design and implement a domain model, and which mapping metadata options are the best choices in a larger project.
Domain models and metadata
This chapter covers
■ The CaveatEmptor example application
■ POJO design for rich domain models
■ Object/relational mapping metadata options
The “Hello World” example in the previous chapter introduced you to Hibernate; however, it isn’t useful for understanding the requirements of real-world applications with complex data models. For the rest of the book, we use a much more sophisticated example application—CaveatEmptor, an online auction system—to demonstrate Hibernate and Java Persistence. We start our discussion of the application by introducing a programming model for persistent classes. Designing and implementing the persistent classes is a multistep process that we’ll examine in detail. First, you’ll learn how to identify the business entities of a problem domain. You create a conceptual model of these entities and their attributes, called a domain model, and you implement it in Java by creating persistent classes. We spend some time exploring exactly what these Java classes should look like, and we also look at the persistence capabilities of the classes, and how this aspect influences the design and implementation. We then explore mapping metadata options—the ways you can tell Hibernate how your persistent classes and their properties relate to database tables and columns. This can involve writing XML documents that are eventually deployed along with the compiled Java classes and are read by Hibernate at runtime. Another option is to use JDK 5.0 metadata annotations, based on the EJB 3.0 standard, directly in the Java source code of the persistent classes. After reading this chapter, you’ll know how to design the persistent parts of your domain model in complex real-world projects, and what mapping metadata option you’ll primarily prefer and use. Finally, in the last (probably optional) section of this chapter, we look at Hibernate’s capability for representation independence. A relatively new feature in Hibernate allows you to create a domain model in Java that is fully dynamic, such as a model without any concrete classes but only HashMaps. 
Hibernate also supports a domain model representation with XML documents. Let’s start with the example application.
3.1
The CaveatEmptor application
The CaveatEmptor online auction application demonstrates ORM techniques and Hibernate functionality; you can download the source code for the application from http://caveatemptor.hibernate.org. We won’t pay much attention to the user interface in this book (it could be web based or a rich client); we’ll concentrate instead on the data access code. However, when a design decision about data
access code that has consequences for the user interface has to be made, we’ll naturally consider both. In order to understand the design issues involved in ORM, let’s pretend the CaveatEmptor application doesn’t yet exist, and that you’re building it from scratch. Our first task would be analysis.
3.1.1
Analyzing the business domain
A software development effort begins with analysis of the problem domain (assuming that no legacy code or legacy database already exists). At this stage, you, with the help of problem domain experts, identify the main entities that are relevant to the software system. Entities are usually notions understood by users of the system: payment, customer, order, item, bid, and so forth. Some entities may be abstractions of less concrete things the user thinks about, such as a pricing algorithm, but even these would usually be understandable to the user. All these entities are found in the conceptual view of the business, which we sometimes call a business model. Developers and architects of object-oriented software analyze the business model and create an object-oriented model, still at the conceptual level (no Java code). This model may be as simple as a mental image existing only in the mind of the developer, or it may be as elaborate as a UML class diagram created by a computer-aided software engineering (CASE) tool like ArgoUML or TogetherJ. A simple model expressed in UML is shown in figure 3.1. This model contains entities that you’re bound to find in any typical auction system: category, item, and user. The entities and their relationships (and perhaps their attributes) are all represented by this model of the problem domain. We call this kind of object-oriented model of entities from the problem domain, encompassing only those entities that are of interest to the user, a domain model. It’s an abstract view of the real world. The motivating goal behind the analysis and design of a domain model is to capture the essence of the business information for the application’s purpose. Developers and architects may, instead of an object-oriented model, also start the application design with a data model (possibly expressed with an Entity-Relationship diagram). We usually say that, with regard to persistence, there is little
difference between the two; they’re merely different starting points. In the end, we’re most interested in the structure and relationships of the business entities, the rules that have to be applied to guarantee the integrity of data (for example, the multiplicity of relationships), and the logic used to manipulate the data. In object modeling, there is a focus on polymorphic business logic. For our purpose and top-down development approach, it’s helpful if we can implement our logical model in polymorphic Java; hence the first draft as an object-oriented model. We then derive the logical relational data model (usually without additional diagrams) and implement the actual physical database schema. Let’s see the outcome of our analysis of the problem domain of the CaveatEmptor application.
Figure 3.1
A class diagram of a typical online auction model
3.1.2
The CaveatEmptor domain model
The CaveatEmptor site auctions many different kinds of items, from electronic equipment to airline tickets. Auctions proceed according to the English auction strategy: Users continue to place bids on an item until the bid period for that item expires, and the highest bidder wins. In any store, goods are categorized by type and grouped with similar goods into sections and onto shelves. The auction catalog requires some kind of hierarchy of item categories so that a buyer can browse these categories or arbitrarily search by category and item attributes. Lists of items appear in the category browser and search result screens. Selecting an item from a list takes the buyer to an item-detail view. An auction consists of a sequence of bids, and one is the winning bid. User details include name, login, address, email address, and billing information. A web of trust is an essential feature of an online auction site. The web of trust allows users to build a reputation for trustworthiness (or untrustworthiness). Buyers can create comments about sellers (and vice versa), and the comments are visible to all other users. A high-level overview of our domain model is shown in figure 3.2. Let’s briefly discuss some interesting features of this model. Each item can be auctioned only once, so you don’t need to make Item distinct from any auction entities. Instead, you have a single auction item entity named Item. Thus, Bid is associated directly with Item. Users can write Comments about other users only in the context of an auction; hence the association between Item and Comment. The Address information of a User is modeled as a separate class, even though the User may have only one Address; they may alternatively have three, for home, billing, and shipping. You do allow the user to have
many BillingDetails. The various billing strategies are represented as subclasses of an abstract class (allowing future extension). A Category may be nested inside another Category. This is expressed by a recursive association, from the Category entity to itself. Note that a single Category may have multiple child categories but at most one parent. Each Item belongs to at least one Category. The entities in a domain model should encapsulate state and behavior. For example, the User entity should define the name and address of a customer and the logic required to calculate the shipping costs for items (to this particular customer). The domain model is a rich object model, with complex associations, interactions, and inheritance relationships. An interesting and detailed discussion of object-oriented techniques for working with domain models can be found in Patterns of Enterprise Application Architecture (Fowler, 2003) or in Domain-Driven Design (Evans, 2003).
Figure 3.2
Persistent classes of the CaveatEmptor domain model and their relationships
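The recursive Category association described above (many children, at most one parent) might look as follows in plain Java. This is only a sketch under simplifying assumptions: the class is named CategorySketch to mark it as hypothetical, it omits items and cycle checks, and it isn't the implementation used by the CaveatEmptor application.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified category with a recursive parent/child association; no persistence code. */
class CategorySketch {

    private final String name;
    private CategorySketch parent;                                       // at most one parent
    private final Set<CategorySketch> children = new LinkedHashSet<>();  // any number of children

    CategorySketch(String name) {
        this.name = name;
    }

    String getName() { return name; }

    CategorySketch getParent() { return parent; }

    Set<CategorySketch> getChildren() {
        return Collections.unmodifiableSet(children);
    }

    /** Attaches a child and keeps both ends of the association consistent. */
    void addChild(CategorySketch child) {
        if (child == null) {
            throw new IllegalArgumentException("child must not be null");
        }
        if (child.parent != null) {
            child.parent.children.remove(child);   // detach from the previous parent first
        }
        child.parent = this;
        children.add(child);
    }

    public static void main(String[] args) {
        CategorySketch electronics = new CategorySketch("Electronics");
        CategorySketch laptops = new CategorySketch("Laptops");
        electronics.addChild(laptops);
        System.out.println(laptops.getParent().getName());
    }
}
```

Routing every change through addChild is one way to enforce the "at most one parent" multiplicity in the object model itself, instead of relying on callers to update both sides.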
In this book, we won’t have much to say about business rules or about the behavior of our domain model. This isn’t because we consider it unimportant; rather, this concern is mostly orthogonal to the problem of persistence. It’s the state of our entities that is persistent, so we concentrate our discussion on how to best represent state in our domain model, not on how to represent behavior. For example, in this book, we aren’t interested in how tax for sold items is calculated or how the system may approve a new user account. We’re more interested in how the relationship between users and the items they sell is represented and made persistent. We’ll revisit this issue in later chapters, whenever we have a closer look at layered application design and the separation of logic and data access.
NOTE
ORM without a domain model—We stress that object persistence with full ORM is most suitable for applications based on a rich domain model. If
your application doesn’t implement complex business rules or complex interactions between entities (or if you have few entities), you may not need a domain model. Many simple and some not-so-simple problems are perfectly suited to table-oriented solutions, where the application is designed around the database data model instead of around an object-oriented domain model, often with logic executed in the database (stored procedures). However, the more complex and expressive your domain model, the more you’ll benefit from using Hibernate; it shines when dealing with the full complexity of object/relational persistence.
Now that you have a (rudimentary) application design with a domain model, the next step is to implement it in Java. Let’s look at some of the things you need to consider.
3.2
Implementing the domain model
Several issues typically must be addressed when you implement a domain model in Java. For instance, how do you separate the business concerns from the crosscutting concerns (such as transactions and even persistence)? Do you need automated or transparent persistence? Do you have to use a specific programming model to achieve this? In this section, we examine these types of issues and how to address them in a typical Hibernate application. Let’s start with an issue that any implementation must deal with: the separation of concerns. The domain model implementation is usually a central, organizing component; it’s reused heavily whenever you implement new application functionality. For this reason, you should be prepared to go to some lengths to ensure
that concerns other than business aspects don’t leak into the domain model implementation.
3.2.1
Addressing leakage of concerns
The domain model implementation is such an important piece of code that it shouldn’t depend on orthogonal Java APIs. For example, code in the domain model shouldn’t perform JNDI lookups or call the database via the JDBC API. This allows you to reuse the domain model implementation virtually anywhere. Most importantly, it makes it easy to unit test the domain model without the need for a particular runtime environment or container (or the need for mocking any service dependencies). This separation emphasizes the distinction between logical unit testing and integration unit testing. We say that the domain model should be concerned only with modeling the business domain. However, there are other concerns, such as persistence, transaction management, and authorization. You shouldn’t put code that addresses these crosscutting concerns in the classes that implement the domain model. When these concerns start to appear in the domain model classes, this is an example of leakage of concerns. The EJB standard solves the problem of leaky concerns. If you implement your domain classes using the entity programming model, the container takes care of some concerns for you (or at least lets you externalize those concerns into metadata, as annotations or XML descriptors). The EJB container prevents leakage of certain crosscutting concerns using interception. An EJB is a managed component, executed inside the EJB container; the container intercepts calls to your beans and executes its own functionality. This approach allows the container to implement the predefined crosscutting concerns—security, concurrency, persistence, transactions, and remoteness—in a generic way. Unfortunately, the EJB 2.1 specification imposes many rules and restrictions on how you must implement a domain model. This, in itself, is a kind of leakage of concerns—in this case, the concerns of the container implementer have leaked! 
This was addressed in the EJB 3.0 specification, which is nonintrusive and much closer to the traditional JavaBean programming model. Hibernate isn’t an application server, and it doesn’t try to implement all the crosscutting concerns of the full EJB specification. Hibernate is a solution for just one of these concerns: persistence. If you require declarative security and transaction management, you should access entity instances via a session bean, taking advantage of the EJB container’s implementation of these concerns. Hibernate in
an EJB container either replaces (EJB 2.1, entity beans with CMP) or implements (EJB 3.0, Java Persistence entities) the persistence aspect. Hibernate persistent classes and the EJB 3.0 entity programming model offer transparent persistence. Hibernate and Java Persistence also provide automatic persistence. Let’s explore both terms in more detail and find an accurate definition.
3.2.2 Transparent and automated persistence
We use transparent to mean a complete separation of concerns between the persistent classes of the domain model and the persistence logic, where the persistent classes are unaware of—and have no dependency on—the persistence mechanism. We use automatic to refer to a persistence solution that relieves you of handling low-level mechanical details, such as writing most SQL statements and working with the JDBC API. The Item class, for example, doesn’t have any code-level dependency on any Hibernate API. Furthermore:
■ Hibernate doesn’t require that any special superclasses or interfaces be inherited or implemented by persistent classes. Nor are any special classes used to implement properties or associations. (Of course, the option to use both techniques is always there.) Transparent persistence improves code readability and maintenance, as you’ll soon see.
■ Persistent classes can be reused outside the context of persistence, in unit tests or in the user interface (UI) tier, for example. Testability is a basic requirement for applications with rich domain models.
■ In a system with transparent persistence, objects aren’t aware of the underlying data store; they need not even be aware that they are being persisted or retrieved. Persistence concerns are externalized to a generic persistence manager interface: in the case of Hibernate, the Session and Query. In JPA, the EntityManager and Query (which has the same name, but a different package and slightly different API) play the same roles.
Transparent persistence fosters a degree of portability; without special interfaces, the persistent classes are decoupled from any particular persistence solution. Our business logic is fully reusable in any other application context. You could easily change to another transparent persistence mechanism. Because JPA follows the same basic principles, there is no difference between Hibernate persistent classes and JPA entity classes.
By this definition of transparent persistence, certain nonautomated persistence layers are transparent (for example, the DAO pattern) because they decouple the persistence-related code with abstract programming interfaces. Only plain Java classes without dependencies are exposed to the business logic or contain the business logic. Conversely, some automated persistence layers (including EJB 2.1 entity instances and some ORM solutions) are nontransparent because they require special interfaces or intrusive programming models. We regard transparency as required. Transparent persistence should be one of the primary goals of any ORM solution. However, no automated persistence solution is completely transparent: Every automated persistence layer, including Hibernate, imposes some requirements on the persistent classes. For example, Hibernate requires that collection-valued properties be typed to an interface such as java.util.Set or java.util.List and not to an actual implementation such as java.util.HashSet (this is a good practice anyway). Or, a JPA entity class has to have a special property, called the database identifier. You now know why the persistence mechanism should have minimal impact on how you implement a domain model, and that transparent and automated persistence are required. What kind of programming model should you use? What are the exact requirements and contracts to observe? Do you need a special programming model at all? In theory, no; in practice, however, you should adopt a disciplined, consistent programming model that is well accepted by the Java community.
3.2.3 Writing POJOs and persistent entity classes
As a reaction against EJB 2.1 entity instances, many developers started talking about Plain Old Java Objects (POJOs),¹ a back-to-basics approach that essentially revives JavaBeans, a component model for UI development, and reapplies it to the business layer. (Most developers now use the terms POJO and JavaBean almost synonymously.) The overhaul of the EJB specification brought us new lightweight entities, and it would be appropriate to call them persistence-capable JavaBeans. Java developers will soon use all three terms as synonyms for the same basic design approach.
¹ POJO is sometimes also written Plain Ordinary Java Objects. This term was coined in 2002 by Martin Fowler, Rebecca Parsons, and Josh Mackenzie.
In this book, we use persistent class for any class implementation that is capable of persistent instances, we use POJO if some Java best practices are relevant, and we use entity class when the Java implementation follows the EJB 3.0 and JPA specifications. Again, you shouldn’t be too concerned about these differences, because the ultimate goal is to apply the persistence aspect as transparently as possible. Almost every Java class can be a persistent class, or a POJO, or an entity class if some good practices are followed.
Hibernate works best with a domain model implemented as POJOs. The few requirements that Hibernate imposes on your domain model implementation are also best practices for the POJO implementation, so most POJOs are Hibernate-compatible without any changes. Hibernate requirements are almost the same as the requirements for EJB 3.0 entity classes, so a POJO implementation can be easily marked up with annotations and made an EJB 3.0 compatible entity.
A POJO declares business methods, which define behavior, and properties, which represent state. Some properties represent associations to other user-defined POJOs. A simple POJO class is shown in listing 3.1. This is an implementation of the User entity of your domain model.
Listing 3.1 POJO implementation of the User class
import java.io.Serializable;

public class User implements Serializable {   // Declaration of Serializable

    private String username;
    private Address address;

    public User() {}                          // No-argument class constructor

    // Property accessor methods
    public String getUsername() {
        return username;
    }
    public void setUsername(String username) {
        this.username = username;
    }
    public Address getAddress() {
        return address;
    }
    public void setAddress(Address address) {
        this.address = address;
    }

    // Business method
    public MonetaryAmount calcShippingCosts(Address fromLocation) {
        ...
    }
}
Hibernate doesn’t require that persistent classes implement Serializable. However, when objects are stored in an HttpSession or passed by value using RMI, serialization is necessary. (This is likely to happen in a Hibernate application.) The class can be abstract and, if needed, extend a nonpersistent class.
Unlike the JavaBeans specification, which requires no specific constructor, Hibernate (and JPA) require a constructor with no arguments for every persistent class. Hibernate uses the Java Reflection API to call this constructor when it instantiates objects. The constructor may be nonpublic, but it has to be at least package-visible if runtime-generated proxies will be used for performance optimization. Proxy generation also requires that the class isn’t declared final and has no final methods. (We’ll come back to proxies in chapter 13, section 13.1, “Defining the global fetch plan.”)
The properties of the POJO implement the attributes of the business entities, for example, the username of User. Properties are usually implemented as private or protected instance variables, together with public property accessor methods: a method for retrieving the value of the instance variable and a method for changing its value. These methods are known as the getter and setter, respectively. The example POJO in listing 3.1 declares getter and setter methods for the username and address properties.
The JavaBean specification defines the guidelines for naming these methods, and they allow generic tools like Hibernate to easily discover and manipulate the property value. A getter method name begins with get, followed by the name of the property (the first letter in uppercase); a setter method name begins with set and similarly is followed by the name of the property. Getter methods for Boolean properties may begin with is instead of get.
You can choose how the state of an instance of your persistent classes should be persisted by Hibernate, either through direct access to its fields or through accessor methods. Your class design isn’t disturbed by these considerations. You can make some accessor methods nonpublic or completely remove them. Some getter and setter methods do something more sophisticated than access instance variables (validation, for example), but trivial accessor methods are common. Their primary advantage is providing an additional buffer between the internal representation and the public interface of the class, allowing independent refactoring of both. The example in listing 3.1 also defines a business method that calculates the cost of shipping an item to a particular user (we left out the implementation of this method).
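The naming conventions and the no-argument constructor matter because a generic tool has nothing but reflection to work with. The following is a minimal sketch of that idea, not Hibernate’s actual implementation: SimpleUser is a hypothetical, trimmed stand-in for the User class of listing 3.1, and PropertyDiscovery with its instantiate() and describe() methods is invented for illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the User POJO (the Address association is omitted).
class SimpleUser {
    private String username;
    private boolean admin;

    SimpleUser() {}  // package-visible no-argument constructor

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }

    public boolean isAdmin() { return admin; }  // boolean getter may begin with "is"
    public void setAdmin(boolean admin) { this.admin = admin; }
}

// Illustrative helper only; this is not how Hibernate is implemented internally.
class PropertyDiscovery {

    // Creates an instance through the no-argument constructor, even if it is nonpublic.
    static <T> T instantiate(Class<T> type) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable no-argument constructor: " + type, e);
        }
    }

    // Derives property names and types from public getter names (get.../is...).
    static Map<String, Class<?>> describe(Class<?> type) {
        Map<String, Class<?>> properties = new TreeMap<>();
        for (Method m : type.getMethods()) {
            if (m.getDeclaringClass() == Object.class || m.getParameterCount() != 0) {
                continue;  // skip getClass() and anything that takes arguments
            }
            String n = m.getName();
            if (n.startsWith("get") && n.length() > 3 && m.getReturnType() != void.class) {
                properties.put(decapitalize(n.substring(3)), m.getReturnType());
            } else if (n.startsWith("is") && n.length() > 2 && m.getReturnType() == boolean.class) {
                properties.put(decapitalize(n.substring(2)), m.getReturnType());
            }
        }
        return properties;
    }

    private static String decapitalize(String s) {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        SimpleUser user = instantiate(SimpleUser.class);
        user.setUsername("johndoe");
        System.out.println(describe(SimpleUser.class).keySet());
    }
}
```

A class without a no-argument constructor would make instantiate() fail, which is the practical reason behind the constructor requirement described above.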
What are the requirements for JPA entity classes? The good news is that so far, all the conventions we’ve discussed for POJOs are also requirements for JPA entities. You have to apply some additional rules, but they’re equally simple; we’ll come back to them later. Now that we’ve covered the basics of using POJO persistent classes as a programming model, let’s see how to handle the associations between those classes.
3.2.4 Implementing POJO associations
You use properties to express associations between POJO classes, and you use accessor methods to navigate from object to object at runtime. Let’s consider the associations defined by the Category class, as shown in figure 3.3.
Figure 3.3 Diagram of the Category class with associations
As with all our diagrams, we left out the association-related attributes (let’s call them parentCategory and childCategories) because they would clutter the illustration. These attributes and the methods that manipulate their values are called scaffolding code. This is what the scaffolding code for the one-to-many self-association of Category looks like:
import java.util.HashSet;
import java.util.Set;

public class Category {
    private String name;
    private Category parentCategory;
    private Set childCategories = new HashSet();

    public Category() { }
    ...
}
To allow bidirectional navigation of the association, you require two attributes. The parentCategory field implements the single-valued end of the association and is declared to be of type Category. The many-valued end, implemented by the childCategories field, must be of collection type. You choose a Set, because duplicates are disallowed, and initialize the instance variable to a new instance of HashSet. Hibernate requires interfaces for collection-typed attributes, so you must use java.util.Set or java.util.List rather than HashSet, for example. This is consistent with the requirements of the JPA specification for collections in entities. At runtime, Hibernate wraps the HashSet instance with an instance of one of Hibernate’s own classes. (This special class isn’t visible to the application code.) It’s
good practice to program to collection interfaces anyway, rather than concrete implementations, so this restriction shouldn’t bother you. You now have some private instance variables but no public interface to allow access from business code or property management by Hibernate (if it shouldn’t access the fields directly). Let’s add some accessor methods to the class:
public String getName() {
    return name;
}
public void setName(String name) {
    this.name = name;
}
public Set getChildCategories() {
    return childCategories;
}
public void setChildCategories(Set childCategories) {
    this.childCategories = childCategories;
}
public Category getParentCategory() {
    return parentCategory;
}
public void setParentCategory(Category parentCategory) {
    this.parentCategory = parentCategory;
}
Again, these accessor methods need to be declared public only if they’re part of the external interface of the persistent class used by the application logic to create a relationship between two objects. However, managing the link between two Category instances is more difficult than setting a foreign key value in a database field. In our experience, developers are often unaware of this complication that arises from a network object model with bidirectional references. Let’s walk through the issue step by step. The basic procedure for adding a child Category to a parent Category looks like this:
Category aParent = new Category();
Category aChild = new Category();
aChild.setParentCategory(aParent);
aParent.getChildCategories().add(aChild);
Whenever a link is created between a parent Category and a child Category, two actions are required:
■ The parentCategory of the child must be set, effectively breaking the association between the child and its old parent (there can only be one parent for any child).
■ The child must be added to the childCategories collection of the new parent Category.
NOTE Managed relationships in Hibernate: Hibernate doesn’t manage persistent associations. If you want to manipulate an association, you must write exactly the same code you would write without Hibernate. If an association is bidirectional, both sides of the relationship must be considered. Programming models like EJB 2.1 entity beans muddled this behavior by introducing container-managed relationships, where the container automatically changes the other side of a relationship if one side is modified by the application. This is one of the reasons why code that uses EJB 2.1 entity beans couldn’t be reused outside the container. EJB 3.0 entity associations are transparent, just like in Hibernate. If you ever have problems understanding the behavior of associations in Hibernate, just ask yourself, “What would I do without Hibernate?” Hibernate doesn’t change the regular Java semantics.
It’s a good idea to add a convenience method to the Category class that groups these operations, allowing reuse and helping ensure correctness, and in the end guarantee data integrity:
public void addChildCategory(Category childCategory) {
    if (childCategory == null)
        throw new IllegalArgumentException("Null child category!");
    if (childCategory.getParentCategory() != null)
        childCategory.getParentCategory().getChildCategories()
                     .remove(childCategory);
    childCategory.setParentCategory(this);
    childCategories.add(childCategory);
}
The addChildCategory() method not only reduces the lines of code when dealing with Category objects, but also enforces the cardinality of the association. Errors that arise from leaving out one of the two required actions are avoided. This kind of grouping of operations should always be provided for associations, if possible. If you compare this with the relational model of foreign keys in a relational database, you can easily see how a network and pointer model complicates a simple operation: instead of a declarative constraint, you need procedural code to guarantee data integrity.
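To see the cardinality enforcement at work, consider moving a child from one parent to another. The sketch below uses a hypothetical CategoryNode class, a trimmed and generics-based variant of Category written only for this illustration; it isn’t part of the example application.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, trimmed variant of Category used only to illustrate re-parenting.
class CategoryNode {
    private final String name;
    private CategoryNode parentCategory;
    private final Set<CategoryNode> childCategories = new HashSet<>();

    CategoryNode(String name) { this.name = name; }

    String getName() { return name; }
    CategoryNode getParentCategory() { return parentCategory; }
    Set<CategoryNode> getChildCategories() { return childCategories; }

    // Groups both required actions: detach from the old parent, attach to the new one.
    void addChildCategory(CategoryNode child) {
        if (child == null)
            throw new IllegalArgumentException("Null child category!");
        if (child.parentCategory != null)
            child.parentCategory.childCategories.remove(child);
        child.parentCategory = this;
        childCategories.add(child);
    }
}

class ReparentingDemo {
    public static void main(String[] args) {
        CategoryNode computers = new CategoryNode("Computers");
        CategoryNode electronics = new CategoryNode("Electronics");
        CategoryNode laptops = new CategoryNode("Laptops");

        computers.addChildCategory(laptops);
        electronics.addChildCategory(laptops);  // moves the child to a new parent

        // The old parent no longer lists the child; both sides stay consistent.
        System.out.println(laptops.getParentCategory().getName());
        System.out.println(computers.getChildCategories().isEmpty());
    }
}
```

Without the convenience method, a caller who only updated the parent reference would leave the child in the old parent’s collection, which is exactly the kind of error the grouping is meant to prevent.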
Because you want addChildCategory() to be the only externally visible mutator method for the child categories (possibly in addition to a removeChildCategory() method), you can make the setChildCategories() method private or drop it and use direct field access for persistence. The getter method still returns a modifiable collection, so clients can use it to make changes that aren’t reflected on the inverse side. You should consider the static methods Collections.unmodifiableCollection(c) and Collections.unmodifiableSet(s), if you prefer to wrap the internal collections before returning them in your getter method. The client then gets an exception if it tries to modify the collection; every modification is forced to go through the relationship-management method. A different kind of relationship exists between the Category and Item classes: a bidirectional many-to-many association, as shown in figure 3.4.
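A getter that hands out a read-only view can be sketched as follows. GuardedCategory is a hypothetical class name, and the removeChildCategory() body is our assumption about what such a method might do. The sketch also assumes that persistence would use direct field access, because the getter no longer returns the internal collection instance.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant of Category whose getter exposes a read-only view.
class GuardedCategory {
    private GuardedCategory parentCategory;
    private final Set<GuardedCategory> childCategories = new HashSet<>();

    GuardedCategory getParentCategory() { return parentCategory; }

    // The view reflects later changes but rejects add() and remove() calls.
    Set<GuardedCategory> getChildCategories() {
        return Collections.unmodifiableSet(childCategories);
    }

    void addChildCategory(GuardedCategory child) {
        if (child == null)
            throw new IllegalArgumentException("Null child category!");
        if (child.parentCategory != null)
            child.parentCategory.childCategories.remove(child);
        child.parentCategory = this;
        childCategories.add(child);
    }

    // Assumed counterpart: clears both sides of the link if the child belongs here.
    void removeChildCategory(GuardedCategory child) {
        if (child == null)
            throw new IllegalArgumentException("Null child category!");
        if (childCategories.remove(child))
            child.parentCategory = null;
    }
}

class GuardedCategoryDemo {
    public static void main(String[] args) {
        GuardedCategory parent = new GuardedCategory();
        GuardedCategory child = new GuardedCategory();
        parent.addChildCategory(child);
        try {
            parent.getChildCategories().add(new GuardedCategory());
        } catch (UnsupportedOperationException e) {
            System.out.println("Direct modification rejected");
        }
    }
}
```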
Figure 3.4 Category and the associated Item class
In the case of a many-to-many association, both sides are implemented with collection-valued attributes. Let’s add the new attributes and methods for accessing the Item relationship to the Category class, as shown in listing 3.2.
Listing 3.2 Category to Item scaffolding code
public class Category {
    ...
    private Set items = new HashSet();
    ...
    public Set getItems() {
        return items;
    }
    public void setItems(Set items) {
        this.items = items;
    }
}
The code for the Item class (the other end of the many-to-many association) is similar to the code for the Category class. You add the collection attribute, the standard accessor methods, and a method that simplifies relationship management, as in listing 3.3.
Listing 3.3 Item to Category scaffolding code
public class Item {
    private String name;
    private String description;
    ...
    private Set categories = new HashSet();
    ...
    public Set getCategories() {
        return categories;
    }
    private void setCategories(Set categories) {
        this.categories = categories;
    }
    public void addCategory(Category category) {
        if (category == null)
            throw new IllegalArgumentException("Null category");
        category.getItems().add(this);
        categories.add(category);
    }
}
The addCategory() method is similar to the addChildCategory() convenience method of the Category class. It’s used by a client to manipulate the link between an Item and a Category. For the sake of readability, we won’t show convenience methods in future code samples and assume you’ll add them according to your own taste. Using convenience methods for association handling isn’t the only way to improve a domain model implementation. You can also add logic to your accessor methods.
3.2.5 Adding logic to accessor methods
One of the reasons we like to use JavaBeans-style accessor methods is that they provide encapsulation: The hidden internal implementation of a property can be changed without any changes to the public interface. This lets you abstract the internal data structure of a class—the instance variables—from the design of the
database, if Hibernate accesses the properties at runtime through accessor methods. It also allows easier and independent refactoring of the public API and the internal representation of a class. For example, if your database stores the name of a user as a single NAME column, but your User class has firstname and lastname properties, you can add the following persistent name property to the class:
import java.util.StringTokenizer;

public class User {
    private String firstname;
    private String lastname;
    ...
    public String getName() {
        return firstname + ' ' + lastname;
    }
    public void setName(String name) {
        // Split the single NAME value into its two parts
        StringTokenizer t = new StringTokenizer(name);
        firstname = t.nextToken();
        lastname = t.nextToken();
    }
    ...
}
Later, you’ll see that a Hibernate custom type is a better way to handle many of these kinds of situations. However, it helps to have several options. Accessor methods can also perform validation. For instance, in the following example, the setFirstname() method verifies that the name is capitalized:
public class User {
    private String firstname;
    ...
    public String getFirstname() {
        return firstname;
    }
    public void setFirstname(String firstname)
            throws InvalidNameException {
        if ( !StringUtil.isCapitalizedName(firstname) )
            throw new InvalidNameException(firstname);
        this.firstname = firstname;
    }
    ...
}
Hibernate may use the accessor methods to populate the state of an instance when loading an object from a database, and sometimes you’ll prefer that this validation
not occur when Hibernate is initializing a newly loaded object. In that case, it makes sense to tell Hibernate to directly access the instance variables. Another issue to consider is dirty checking. Hibernate automatically detects object state changes in order to synchronize the updated state with the database. It’s usually safe to return a different object from the getter method than the object passed by Hibernate to the setter. Hibernate compares the objects by value—not by object identity—to determine whether the property’s persistent state needs to be updated. For example, the following getter method doesn’t result in unnecessary SQL UPDATEs:
public String getFirstname() { return new String(firstname); }
There is one important exception to this: Collections are compared by identity! For a property mapped as a persistent collection, you should return exactly the same collection instance from the getter method that Hibernate passed to the setter method. If you don’t, Hibernate will update the database, even if no update is necessary, every time the state held in memory is synchronized with the database. This kind of code should almost always be avoided in accessor methods:
public void setNames(List namesList) {
    names = (String[]) namesList.toArray(new String[0]);
}
public List getNames() {
    return Arrays.asList(names);
}
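By contrast, an accessor pair that keeps and returns the very same collection instance avoids the problem. The following sketch uses a hypothetical NameHolder class; the sameInstance() and sameValue() helpers only illustrate the two kinds of comparison described above and aren’t Hibernate APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical class whose accessors preserve the collection instance.
class NameHolder {
    private List<String> names = new ArrayList<>();

    // Returns exactly the instance that was passed to the setter.
    public List<String> getNames() { return names; }
    public void setNames(List<String> names) { this.names = names; }

    // Business code changes the contents, never the reference.
    public void addName(String name) { names.add(name); }
}

class IdentityCheckDemo {
    // Identity comparison: the rule the text describes for collections.
    static boolean sameInstance(Object a, Object b) { return a == b; }

    // Value comparison: the rule the text describes for other properties.
    static boolean sameValue(Object a, Object b) { return Objects.equals(a, b); }

    public static void main(String[] args) {
        List<String> loaded = new ArrayList<>(Arrays.asList("John", "Jane"));
        NameHolder holder = new NameHolder();
        holder.setNames(loaded);
        System.out.println(sameInstance(holder.getNames(), loaded));

        // Two wrappers around the same array are equal by value but not identical.
        String[] array = { "John", "Jane" };
        List<String> first = Arrays.asList(array);
        List<String> second = Arrays.asList(array);
        System.out.println(sameValue(first, second) + " " + sameInstance(first, second));
    }
}
```

The second half of the demo shows why the array-based accessors are troublesome: every call to the getter produces a new wrapper, so an identity check never succeeds even though the contents are unchanged.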
Finally, you have to know how exceptions in accessor methods are handled if you configure Hibernate to use these methods when loading and storing instances. If a RuntimeException is thrown, the current transaction is rolled back, and the exception is yours to handle. If a checked application exception is thrown, Hibernate wraps the exception into a RuntimeException. You can see that Hibernate doesn’t unnecessarily restrict you with a POJO programming model. You’re free to implement whatever logic you need in accessor methods (as long as you keep the same collection instance in both getter and setter). How Hibernate accesses the properties is completely configurable. This kind of transparency guarantees an independent and reusable domain model implementation. And everything we have explained and said so far is equally true for both Hibernate persistent classes and JPA entities. Let’s now define the object/relational mapping for the persistent classes.
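The exception behavior is easier to picture with a small reflective caller. The following sketch is not Hibernate’s code; ReflectiveSetter and Person are hypothetical names, and the capitalization check is a simplified stand-in for the StringUtil helper used earlier. It shows one plausible way a framework could pass runtime exceptions through and wrap checked ones.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Checked application exception, as in the validation example.
class InvalidNameException extends Exception {
    InvalidNameException(String name) { super("Invalid name: " + name); }
}

// Hypothetical persistent class with a validating setter.
class Person {
    private String firstname;

    public String getFirstname() { return firstname; }

    public void setFirstname(String firstname) throws InvalidNameException {
        // Simplified stand-in for a capitalization check.
        if (firstname == null || firstname.isEmpty()
                || !Character.isUpperCase(firstname.charAt(0)))
            throw new InvalidNameException(firstname);
        this.firstname = firstname;
    }
}

// Illustrative only: how a reflective caller might treat accessor exceptions.
class ReflectiveSetter {
    static void set(Object target, String setterName, Class<?> paramType, Object value) {
        try {
            Method setter = target.getClass().getMethod(setterName, paramType);
            setter.setAccessible(true);
            setter.invoke(target, value);
        } catch (InvocationTargetException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) throw (RuntimeException) cause;  // passed through
            if (cause instanceof Error) throw (Error) cause;
            throw new RuntimeException("Exception in " + setterName, cause);        // checked: wrapped
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + setterName, e);
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        set(person, "setFirstname", String.class, "Anna");
        try {
            set(person, "setFirstname", String.class, "anna");
        } catch (RuntimeException e) {
            System.out.println("Wrapped: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```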
3.3 Object/relational mapping metadata
ORM tools require metadata to specify the mapping between classes and tables, properties and columns, associations and foreign keys, Java types and SQL types,
and so on. This information is called the object/relational mapping metadata. Metadata is data about data, and mapping metadata defines and governs the transformation between the different type systems and relationship representations in object-oriented and SQL systems. It’s your job as a developer to write and maintain this metadata. We discuss various approaches in this section, including metadata in XML files and JDK 5.0 source code annotations. Usually you decide to use one strategy in a particular project, and after reading these sections you’ll have the background information to make an educated decision.
3.3.1 Metadata in XML
Any ORM solution should provide a human-readable, easily hand-editable mapping format, not just a GUI mapping tool. Currently, the most popular object/relational metadata format is XML. Mapping documents written in and with XML are lightweight, human readable, easily manipulated by version-control systems and text editors, and they can be customized at deployment time (or even at runtime, with programmatic XML generation).
But is XML-based metadata really the best approach? A certain backlash against the overuse of XML can be seen in the Java community. Every framework and application server seems to require its own XML descriptors. In our view, there are three main reasons for this backlash:
■ Metadata-based solutions have often been used inappropriately. Metadata is not, by nature, more flexible or maintainable than plain Java code.
■ Many existing metadata formats weren’t designed to be readable and easy to edit by hand. In particular, a major cause of pain is the lack of sensible defaults for attribute and element values, requiring significantly more typing than should be necessary. Even worse, some metadata schemas use only XML elements and text values, without any attributes. Another problem is schemas that are too generic, where every declaration is wrapped in a generic extension attribute of a meta element.
■ Good XML editors, especially in IDEs, aren’t as common as good Java coding environments. Worst, and most easily fixable, a document type declaration (DTD) often isn’t provided, preventing autocompletion and validation.
There is no getting around the need for metadata in ORM. However, Hibernate was designed with full awareness of the typical metadata problems. The XML metadata format of Hibernate is extremely readable and defines useful default values. If attribute values are missing, reflection is used on the mapped class to determine defaults. Hibernate also comes with a documented and complete DTD. Finally, IDE support for XML has improved lately, and modern IDEs provide dynamic XML validation and even an autocomplete feature. Let’s look at the way you can use XML metadata in Hibernate. You created the Category class in the previous section; now you need to map it to the CATEGORY table in the database. To do that, you write the XML mapping document in listing 3.4.
Listing 3.4 Hibernate XML mapping of the Category class
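A mapping document consistent with the description that follows might look like this sketch. The DTD identifiers are those of the Hibernate 3.0 mapping DTD; the identifier property name id, the CATEGORY_ID column, the long type, and the native generator are assumptions made for illustration. The letters in the comments correspond to the callouts explained after the listing.

```xml
<?xml version="1.0"?>
<!-- B: DTD declaration, used for syntactic validation -->
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<!-- C: root element for all mapping declarations -->
<hibernate-mapping>

    <!-- D: class-to-table mapping -->
    <class name="auction.model.Category" table="CATEGORY">

        <!-- E: object identity (property, column, and generator are assumed) -->
        <id name="id" column="CATEGORY_ID" type="long">
            <generator class="native"/>
        </id>

        <!-- F: property mapped with the built-in Hibernate type "string" -->
        <property name="name" column="NAME" type="string"/>

    </class>

</hibernate-mapping>
```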
B The Hibernate mapping DTD should be declared in every mapping file; it’s required for syntactic validation of the XML.
C Mappings are declared inside a <hibernate-mapping> element. You may include as many class mappings as you like, along with certain other special declarations that we’ll mention later in the book.
D The class Category (in the auction.model package) is mapped to the CATEGORY table. Every row in this table represents one instance of type Category.
E We haven’t discussed the concept of object identity, so you may be surprised by this mapping element. This complex topic is covered in the next chapter. To understand this mapping, it’s sufficient to know that every row in the CATEGORY table has a primary key value that matches the object identity of the instance in memory. The <id> mapping element is used to define the details of object identity.
F The property name of type java.lang.String is mapped to a database NAME column. Note that the type declared in the mapping is a built-in Hibernate type (string), not the type of the Java property or the SQL column type. Think about this as the converter that represents a bridge between the other two type systems.
We’ve intentionally left the collection and association mappings out of this example. Association and especially collection mappings are more complex, so we’ll return to them in the second part of the book.
Although it’s possible to declare mappings for multiple classes in one mapping file by using multiple <class> elements, the recommended practice (and the practice expected by some Hibernate tools) is to use one mapping file per persistent class. The convention is to give the file the same name as the mapped class, appending a suffix (for example, Category.hbm.xml), and putting it in the same package as the Category class.
As already mentioned, XML mapping files aren’t the only way to define mapping metadata in a Hibernate application. If you use JDK 5.0, your best choice is the Hibernate Annotations based on the EJB 3.0 and Java Persistence standard.
3.3.2 Annotation-based metadata
The basic idea is to put metadata next to the information it describes, instead of separating it physically into a different file. Java didn’t have this functionality before JDK 5.0, so an alternative was developed. The XDoclet project introduced annotation of Java source code with meta-information, using special Javadoc tags with support for key/value pairs. Through nesting of tags, quite complex structures are supported, but only some IDEs allow customization of Javadoc templates for autocompletion and validation. Java Specification Request (JSR) 175 introduced the annotation concept in the Java language, with type-safe and declared interfaces for the definition of annotations. Autocompletion and compile-time checking are no longer an issue. We found that annotation metadata is, compared to XDoclet, nonverbose and that it
has better defaults. However, JDK 5.0 annotations are sometimes more difficult to read than XDoclet annotations, because they aren’t inside regular comment blocks; you should use an IDE that supports configurable syntax highlighting of annotations. Other than that, we found no serious disadvantage in working with annotations in our daily work in the past years, and we consider annotation-metadata support to be one of the most important features of JDK 5.0.
We’ll now introduce mapping annotations and use JDK 5.0. If you have to work with JDK 1.4 but like to use annotation-based metadata, consider XDoclet, which we’ll show afterwards.
Defining and using annotations
Before you annotate the first persistent class, let’s see how annotations are created. Naturally, you’ll usually use predefined annotations. However, knowing how to extend the existing metadata format or how to write your own annotations is a useful skill. The following code example shows the definition of an Entity annotation:
package javax.persistence;

@Target(TYPE)
@Retention(RUNTIME)
public @interface Entity {
    String name() default "";
}
The first line defines the package, as always. This annotation is in the package javax.persistence, the Java Persistence API as defined by EJB 3.0. It’s one of the most important annotations of the specification; you can apply it on a POJO to make it a persistent entity class. The next line is an annotation that adds meta-information to the @Entity annotation (metadata about metadata). It specifies that the @Entity annotation can only be put on type declarations; in other words, you can only mark up classes with the @Entity annotation, not fields or methods. The retention policy chosen for this annotation is RUNTIME; other options (for other use cases) include removal of the annotation metadata during compilation, or only inclusion in byte-code without possible runtime reflectivity. You want to preserve all entity meta-information even at runtime, so Hibernate can read it on startup through Java Reflection. What follows in the example is the actual declaration of the annotation, including its interface name and its attributes (just one in this case, name, with an empty string default).
Let’s use this annotation to make a POJO persistent class a Java Persistence entity:
package auction.model;

import javax.persistence.*;

@Entity
@Table(name = "ITEM")
public class Item {
    ...
}
This public class, Item, has been declared as a persistent entity. All of its properties are now automatically persistent with a default strategy. Also shown is a second annotation that declares the name of the table in the database schema this persistent class is mapped to. If you omit this information, the JPA provider defaults to the unqualified class name (just as Hibernate will if you omit the table name in an XML mapping file). All of this is type-safe, and declared annotations are read with Java Reflection when Hibernate starts up. You don’t need to write any XML mapping files, Hibernate doesn’t need to parse any XML, and startup is faster. Your IDE can also easily validate and highlight annotations; they are regular Java types, after all.
One of the clear benefits of annotations is their flexibility for agile development. If you refactor your code, you rename, delete, or move classes and properties all the time. Most development tools and editors can’t refactor XML element and attribute values, but annotations are part of the Java language and are included in all refactoring operations.
Which annotations should you apply? You have the choice among several standardized and vendor-specific packages.
Considering standards
Annotation-based metadata has a significant impact on how you write Java applications. Other programming environments, like C# and .NET, had this kind of support for quite a while, and developers adopted the metadata attributes quickly. In the Java world, the big rollout of annotations is happening with Java EE 5.0. All specifications that are considered part of Java EE, like EJB, JMS, JMX, and even the servlet specification, will be updated and use JDK 5.0 annotations for metadata needs. For example, web services in J2EE 1.4 usually require significant metadata in XML files, so we expect to see real productivity improvements with annotations.
Or, you can let the web container inject an EJB handle into your servlet, by adding an annotation on a field. Sun initiated a specification effort (JSR 250) to take care of the annotations across specifications, defining common annotations for the
whole Java platform. For you, however, working on a persistence layer, the most important specifications are EJB 3.0 and JPA. Annotations from the Java Persistence package are available in javax.persistence once you have included the JPA interfaces in your classpath. You can use these annotations to declare persistent entity classes, embeddable classes (we’ll discuss these in the next chapter), properties, fields, keys, and so on. The JPA specification covers the basics and most relevant advanced mappings: everything you need to write a portable application, with a pluggable, standardized persistence layer that works inside and outside of any runtime container.

What annotations and mapping features aren’t specified in Java Persistence? A particular JPA engine and product may naturally offer advantages: the so-called vendor extensions.

Utilizing vendor extensions

Even if you map most of your application’s model with JPA-compatible annotations from the javax.persistence package, you’ll have to use vendor extensions at some point. For example, almost all performance-tuning options you’d expect to be available in high-quality persistence software, such as fetching and caching settings, are only available as Hibernate-specific annotations. Let’s see what that looks like in an example. Annotate the Item entity source code again:
package auction.model;

import javax.persistence.*;

@Entity
@Table(name = "ITEM")
@org.hibernate.annotations.BatchSize(size = 10)
@org.hibernate.annotations.DiscriminatorFormula(
    "case when ITEM_IS_SPECIAL is not null then A else B end"
)
public class Item {
    ...
}
This example contains two Hibernate annotations. The first, @BatchSize, is a fetching option that can increase performance in situations we’ll examine later in this book. The second, @DiscriminatorFormula, is a Hibernate mapping annotation that is especially useful for legacy schemas when class inheritance can’t be determined with simple literal values (here it maps a legacy column ITEM_IS_SPECIAL—probably some kind of flag—to a literal value). Both annotations are prefixed with the org.hibernate.annotations package name.
Consider this a good practice, because you can now easily see which metadata of this entity class comes from the JPA specification and which is vendor-specific. You can also easily search your source code for “org.hibernate.annotations” and get a complete overview of all nonstandard annotations in your application in a single search result. If you switch your Java Persistence provider, you only have to replace the vendor-specific extensions, and you can expect a similar feature set to be available with most sophisticated solutions. Of course, we hope you’ll never have to do this, and it doesn’t happen often in practice; just be prepared.

Annotations on classes only cover metadata that is applicable for that particular class. However, you often need metadata at a higher level, for a whole package or even the whole application. Before we discuss these options, we’d like to introduce another mapping metadata format.

XML descriptors in JPA and EJB 3.0

The EJB 3.0 and Java Persistence standard embraces annotations aggressively. However, the expert group has been aware of the advantages of XML deployment descriptors in certain situations, especially for configuration metadata that changes with each deployment. As a consequence, every annotation in EJB 3.0 and JPA can be replaced with an XML descriptor element. In other words, you don’t have to use annotations if you don’t want to (although we strongly encourage you to reconsider and give annotations a try, if this is your first reaction to annotations). Let’s look at an example of a JPA XML descriptor for a particular persistence unit:
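<?xml version="1.0" encoding="UTF-8"?>
<entity-mappings
    xmlns="http://java.sun.com/xml/ns/persistence/orm"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence/orm
                        http://java.sun.com/xml/ns/persistence/orm_1_0.xsd"
    version="1.0">

    <persistence-unit-metadata>
        <xml-mapping-metadata-complete/>
        <persistence-unit-defaults>
            <schema>MY_SCHEMA</schema>
            <catalog>MY_CATALOG</catalog>
            <cascade-persist/>
        </persistence-unit-defaults>
    </persistence-unit-metadata>

    <package>auction.model</package>

    <entity class="Item" access="PROPERTY" metadata-complete="true">
        <attributes>
            <id name="id"/>
        </attributes>
    </entity>

</entity-mappings>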
This XML is automatically picked up by the JPA provider if you place it in a file called orm.xml in your classpath, in the META-INF directory of the persistence unit. You can see that you only have to name an identifier property for a class; as in annotations, all other properties of the entity class are automatically considered persistent with a sensible default mapping. You can also set default mappings for the whole persistence unit, such as the schema name and default cascading options. If you include the <xml-mapping-metadata-complete> element, the JPA provider completely ignores all annotations on your entity classes in this persistence unit and relies only on the mappings as defined in the orm.xml file. You can (redundantly in this case) enable this on an entity level, with metadata-complete="true". If enabled, the JPA provider assumes that all properties of the entity are mapped in XML, and that all annotations for this entity should be ignored. If you don’t want to ignore but instead want to override the annotation metadata, first remove the global <xml-mapping-metadata-complete> element from the orm.xml file. Also remove the metadata-complete="true" attribute from any entity mapping that should override, not replace, annotations:
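<entity-mappings
    xmlns="http://java.sun.com/xml/ns/persistence/orm"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence/orm
                        http://java.sun.com/xml/ns/persistence/orm_1_0.xsd"
    version="1.0">

    <package>auction.model</package>

    <entity class="Item">
        <attributes>
            <basic name="initialPrice" optional="false">
                <column name="INIT_PRICE"/>
            </basic>
        </attributes>
    </entity>

</entity-mappings>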
Here you map the initialPrice property to the INIT_PRICE column and specify it isn’t nullable. Any annotation on the initialPrice property of the Item class is
ignored, but all other annotations on the Item class are still applied. Also note that you didn’t specify an access strategy in this mapping, so field or accessor method access is used depending on the position of the @Id annotation in Item. (We’ll get back to this detail in the next chapter.) An obvious problem with XML deployment descriptors in Java Persistence is their compatibility with native Hibernate XML mapping files. The two formats aren’t compatible at all, and you should make a decision to use one or the other. The syntax of the JPA XML descriptor is much closer to the actual JPA annotations than to the native Hibernate XML mapping files. You also need to consider vendor extensions when you make a decision for an XML metadata format. The Hibernate XML format supports all possible Hibernate mappings, so if something can’t be mapped in JPA/Hibernate annotations, it can be mapped with native Hibernate XML files. The same isn’t true with JPA XML descriptors—they only provide convenient externalized metadata that covers the specification. Sun does not allow vendor extensions with an additional namespace. On the other hand, you can’t override annotations with Hibernate XML mapping files; you have to define a complete entity class mapping in XML. For these reasons, we don’t show all possible mappings in all three formats; we focus on native Hibernate XML metadata and JPA/Hibernate annotations. However, you’ll learn enough about the JPA XML descriptor to use it if you want to. Consider JPA/Hibernate annotations the primary choice if you’re using JDK 5.0. Fall back to native Hibernate XML mapping files if you want to externalize a particular class mapping or utilize a Hibernate extension that isn’t available as an annotation. 
Consider JPA XML descriptors only if you aren’t planning to use any vendor extension (which is, in practice, unlikely), or if you want to only override a few annotations, or if you require complete portability that even includes deployment descriptors. But what if you’re stuck with JDK 1.4 (or even 1.3) and still want to benefit from the better refactoring capabilities and reduced lines of code of inline metadata?
3.3.3
Using XDoclet
The XDoclet project has brought the notion of attribute-oriented programming to Java. XDoclet leverages the Javadoc tag format (@attribute) to specify class-, field-, or method-level metadata attributes. There is even a book about XDoclet from Manning Publications, XDoclet in Action (Walls and Richards, 2004). XDoclet is implemented as an Ant task that generates Hibernate XML metadata (or something else, depending on the plug-in) as part of the build process.
Creating the Hibernate XML mapping document with XDoclet is straightforward; instead of writing it by hand, you mark up the Java source code of your persistent class with custom Javadoc tags, as shown in listing 3.5.
Listing 3.5 Using XDoclet tags to mark up Java classes with mapping metadata
/**
 * The Category class of the CaveatEmptor auction site domain model.
 *
 * @hibernate.class
 *  table="CATEGORY"
 */
public class Category {
    ...

    /**
     * @hibernate.id
     *  generator-class="native"
     *  column="CATEGORY_ID"
     */
    public Long getId() {
        return id;
    }

    ...

    /**
     * @hibernate.property
     */
    public String getName() {
        return name;
    }

    ...
}
With the annotated class in place and an Ant task ready, you can automatically generate the same XML document shown in the previous section (listing 3.4). The downside to XDoclet is that it requires another build step. Most large Java projects are using Ant already, so this is usually a nonissue. Arguably, XDoclet mappings are less configurable at deployment time; but there is nothing stopping you from hand-editing the generated XML before deployment, so this is probably not a significant objection. Finally, support for XDoclet tag validation may not be available in your development environment. However, the latest IDEs support at least autocompletion of tag names. We won’t cover XDoclet in this book, but you can find examples on the Hibernate website.
Whether you use XML files, JDK 5.0 annotations, or XDoclet, you’ll often notice that you have to duplicate metadata in several places. In other words, you need to add global information that is applicable to more than one property, more than one persistent class, or even the whole application.
3.3.4
Handling global metadata
Consider the following situation: All of your domain model persistent classes are in the same package. However, you have to specify class names fully qualified, including the package, in every XML mapping file. It would be a lot easier to declare the package name once and then use only the short persistent class name. Or, instead of enabling direct field access for every single property through the access="field" mapping attribute, you’d rather use a single switch to enable field access for all properties. Class- or package-scoped metadata would be much more convenient.

Some metadata is valid for the whole application. For example, query strings can be externalized to metadata and called by a globally unique name in the application code. Similarly, a query usually isn’t related to a particular class, and sometimes not even to a particular package. Other application-scoped metadata includes user-defined mapping types (converters) and data filter (dynamic view) definitions.

Let’s walk through some examples of global metadata in Hibernate XML mappings and JDK 5.0 annotations.

Global XML mapping metadata

If you check the XML mapping DTD, you’ll see that the <hibernate-mapping> root element has global options that are applied to the class mapping(s) inside it. Some of these options are shown in the following example:
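<hibernate-mapping
    schema="AUCTION"
    default-lazy="false"
    default-access="field"
    auto-import="false">

    ...

</hibernate-mapping>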
The schema attribute enables a database schema prefix, AUCTION, used by Hibernate for all SQL statements generated for the mapped classes. By setting default-lazy to false, you enable default outer-join fetching for some class associations, a
topic we’ll discuss in chapter 13, section 13.1, “Defining the global fetch plan.” (This default-lazy="false" switch has an interesting side effect: It switches to Hibernate 2.x default fetching behavior, which is useful if you migrate to Hibernate 3.x but don’t want to update all fetching settings.) With default-access, you enable direct field access by Hibernate for all persistent properties of all classes mapped in this file. Finally, the auto-import setting is turned off for all classes in this file. We’ll talk about importing and naming of entities in chapter 4, section 4.3, “Class mapping options.”
TIP
Mapping files with no class declarations—Global metadata is required and present in any sophisticated application. For example, you may easily import a dozen interfaces, or externalize a hundred query strings. In large-scale applications, you often create mapping files without actual class mappings, and only imports, external queries, or global filter and type definitions. If you look at the DTD, you can see that mappings are optional inside the root element. Split up and organize your global metadata into separate files, such as AuctionTypes.hbm.xml, AuctionQueries.hbm.xml, and so on, and load them in Hibernate’s configuration just like regular mapping files. However, make sure that all custom types and filters are loaded before any other mapping metadata that applies these types and filters to class mappings.
Let’s look at global metadata with JDK 5.0 annotations.

Global annotation metadata

Annotations are by nature woven into the Java source code for a particular class. Although it’s possible to place global annotations in the source file of a class (at the top), we’d rather keep global metadata in a separate file. This is called package metadata, and it’s enabled with a file named package-info.java in a particular package directory:
@org.hibernate.annotations.TypeDefs({
    @org.hibernate.annotations.TypeDef(
        name = "monetary_amount_usd",
        typeClass = MonetaryAmountType.class,
        parameters = {
            @org.hibernate.annotations.Parameter(name = "convertTo", value = "USD")
        }
    ),
    @org.hibernate.annotations.TypeDef(
        name = "monetary_amount_eur",
        typeClass = MonetaryAmountType.class,
        parameters = {
            @org.hibernate.annotations.Parameter(name = "convertTo", value = "EUR")
        }
    )
})
@org.hibernate.annotations.NamedQueries({
    @org.hibernate.annotations.NamedQuery(
        name = "findItemsOrderByPrice",
        query = "select i from Item i order by i.initialPrice"
    )
})
package auction.persistence.types;
This example of a package metadata file, in the package auction.persistence.types, declares two Hibernate type converters. We’ll discuss the Hibernate type system in chapter 5, section 5.2, “The Hibernate type system.” You can now refer to the user-defined types in class mappings by their names. The same mechanism can be used to externalize queries and to define global identifier generators (not shown in the last example).

There is a reason the previous code example only includes annotations from the Hibernate package and no Java Persistence annotations. One of the (last-minute) changes made to the JPA specification was the removal of package visibility of JPA annotations. As a result, no Java Persistence annotations can be placed in a package-info.java file. If you need portable global Java Persistence metadata, put it in an orm.xml file.

Note that you have to name a package that contains a metadata file in your Hibernate or JPA persistence unit configuration if you aren’t using automatic detection; see chapter 2, section 2.2.1, “Using Hibernate Annotations.”

Global annotations (Hibernate and JPA) can also be placed in the source code of a particular class, right after the import section. The syntax for the annotations is the same as in the package-info.java file, so we won’t repeat it here.

You now know how to write local and global mapping metadata. Another issue in large-scale applications is the portability of metadata.

Using placeholders

In any larger Hibernate application, you’ll face the problem of native code in your mapping metadata: code that effectively binds your mapping to a particular database product. For example, SQL statements, such as in formula, constraint, or filter mappings, aren’t parsed by Hibernate but are passed directly through to the database management system. The advantage is flexibility: you can call any native SQL function or keyword your database system supports.
The disadvantage of putting native SQL in your mapping metadata is lost database portability, because your mappings, and hence your application, will work only for a particular DBMS (or even DBMS version).
Even simple things, such as primary key generation strategies, usually aren’t portable across all database systems. In the next chapter, we discuss a special identifier generator called native, which is a built-in smart primary key generator. On Oracle, it uses a database sequence to generate primary key values for rows in a table; on IBM DB2, it uses a special identity primary key column by default. This is how you map it in XML:
...
We’ll discuss the details of this mapping later. The interesting part is the declaration class="native" as the identifier generator. Let’s assume that the portability this generator provides isn’t what you need, perhaps because you use a custom identifier generator, a class you wrote that implements the Hibernate IdentifierGenerator interface:
The XML mapping file is now bound to a particular database product, and you lose the database portability of the Hibernate application. One way to deal with this issue is to use a placeholder in your XML file that is replaced during build when the mapping files are copied to the target directory (Ant supports this). This mechanism is recommended only if you have experience with Ant or already need build-time substitution for other parts of your application. A much more elegant variation is to use custom XML entities (not related to our application’s business entities). Let’s assume you need to externalize an element or attribute value in your XML files to keep it portable:
The &idgenerator; value is called an entity placeholder. You can define its value at the top of the XML file as an entity declaration, as part of the document type definition:
]>
The XML parser will now substitute the placeholder on Hibernate startup, when mapping files are read. You can take this one step further and externalize this addition to the DTD in a separate file and include the global options in all other mapping files:
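<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping SYSTEM
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"
[
<!ENTITY % globals SYSTEM "classpath://persistence/globals.dtd">
%globals;
]>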
This example shows the inclusion of an external file as part of the DTD. The syntax, as often in XML, is rather crude, but the purpose of each line should be clear. All global settings are added to the globals.dtd file in the persistence package on the classpath:
To switch from Oracle to a different database system, just deploy a different globals.dtd file. Often, you need not only to substitute an XML element or attribute value but also to include whole blocks of mapping metadata in all files, such as when many of your classes share some common properties, and you can’t use inheritance to capture them in a single location. With XML entity replacement, you can externalize an XML snippet to a separate file and include it in other XML files. Let’s assume all the persistent classes have a dateModified property. The first step is to put this mapping in its own file, say, DateModified.hbm.xml:
This file needs no XML header or any other tags. Now you include it in the mapping file for a persistent class:
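<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping SYSTEM
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"
[
<!ENTITY datemodified SYSTEM "classpath://model/DateModified.hbm.xml">
]>
<hibernate-mapping>
    <class name="Item" table="ITEM">
        ...
        &datemodified;
        ...
    </class>
</hibernate-mapping>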
The content of DateModified.hbm.xml will be included and be substituted for the &datemodified; placeholder. This, of course, also works with larger XML snippets. When Hibernate starts up and reads mapping files, XML DTDs have to be resolved by the XML parser. The built-in Hibernate entity resolver looks for the hibernate-mapping-3.0.dtd on the classpath; it should find the DTD in the hibernate3.jar file before it tries to look it up on the Internet, which happens automatically whenever an entity URL is prefixed with http://hibernate.sourceforge.net/. The Hibernate entity resolver can also detect the classpath:// prefix, and the resource is then searched for in the classpath, where you can copy it on deployment. We have to repeat this FAQ: Hibernate never looks up the DTD on the Internet if you have a correct DTD reference in your mapping and the right JAR on the classpath. The approaches we have described so far—XML, JDK 5.0 annotations, and XDoclet attributes—assume that all mapping information is known at development (or deployment) time. Suppose, however, that some information isn’t known before the application starts. Can you programmatically manipulate the mapping metadata at runtime?
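The substitution of an entity placeholder is done by the XML parser itself, not by Hibernate, so it can be observed with nothing but the JDK. The following is a minimal sketch under stated assumptions: the root element name mapping, the generator element, and the class name example.CustomGenerator are hypothetical stand-ins chosen for illustration, and the document declares its entity in an internal DTD subset so that no external DTD has to be resolved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class EntityPlaceholderSketch {

    // Hypothetical document: element names and the generator class name are
    // invented for this sketch and are not a real Hibernate mapping file.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE mapping [\n"
        + "  <!ENTITY idgenerator \"example.CustomGenerator\">\n"
        + "]>\n"
        + "<mapping>\n"
        + "  <generator class=\"&idgenerator;\"/>\n"
        + "</mapping>\n";

    // Parses the document with the JDK's DOM parser and returns the value of
    // the class attribute after the parser has expanded the entity reference.
    static String resolveGeneratorClass(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element generator =
                (Element) doc.getElementsByTagName("generator").item(0);
            return generator.getAttribute("class");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sketch document", e);
        }
    }

    public static void main(String[] args) {
        // The placeholder is already replaced when the application sees the DOM
        System.out.println(resolveGeneratorClass(XML));
    }
}
```

Because the replacement happens during parsing, code that reads the mapping never sees the placeholder, only its value. Swapping the entity declaration is therefore enough to change every place that references it.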
3.3.5
Manipulating metadata at runtime
It’s sometimes useful for an application to browse, manipulate, or build new mappings at runtime. XML APIs like DOM, dom4j, and JDOM allow direct runtime manipulation of XML documents, so you could create or manipulate an XML document at runtime, before feeding it to the Configuration object. On the other hand, Hibernate also exposes a configuration-time metamodel that contains all the information declared in your static mapping metadata. Direct programmatic manipulation of this metamodel is sometimes useful, especially for applications that allow for extension by user-written code. A more drastic approach would be complete programmatic and dynamic definition of the mapping metadata, without any static mapping. However, this is exotic and
should be reserved for a particular class of fully dynamic applications, or application building kits. The following code adds a new property, motto, to the User class:
// Get the existing mapping for User from Configuration
PersistentClass userMapping =
    cfg.getClassMapping(User.class.getName());

// Define a new column for the USER table
Column column = new Column();
column.setName("MOTTO");
column.setNullable(false);
column.setUnique(true);
userMapping.getTable().addColumn(column);

// Wrap the column in a Value
SimpleValue value = new SimpleValue();
value.setTable( userMapping.getTable() );
value.setTypeName("string");
value.addColumn(column);

// Define a new property of the User class
Property prop = new Property();
prop.setValue(value);
prop.setName("motto");
prop.setNodeName(prop.getName());
userMapping.addProperty(prop);

// Build a new session factory, using the new mapping
SessionFactory sf = cfg.buildSessionFactory();
A PersistentClass object represents the metamodel for a single persistent class, and you retrieve it from the Configuration object. Column, SimpleValue, and Property are all classes of the Hibernate metamodel and are available in the org.hibernate.mapping package.
TIP
Keep in mind that adding a property to an existing persistent class mapping, as shown here, is quite easy, but programmatically creating a new mapping for a previously unmapped class is more involved.
Once a SessionFactory is created, its mappings are immutable. The SessionFactory uses a different metamodel internally than the one used at configuration time. There is no way to get back to the original Configuration from the SessionFactory or Session. (Note that you can get the SessionFactory from a Session if you wish to access a global setting.) However, the application can read the SessionFactory’s metamodel by calling getClassMetadata() or getCollectionMetadata(). Here’s an example:
Item item = ...;

ClassMetadata meta = sessionFactory.getClassMetadata(Item.class);

String[] metaPropertyNames =
    meta.getPropertyNames();

Object[] propertyValues =
    meta.getPropertyValues(item, EntityMode.POJO);
This code snippet retrieves the names of persistent properties of the Item class and the values of those properties for a particular instance. This helps you write generic code. For example, you may use this feature to label UI components or improve log output. Although you’ve seen some mapping constructs in the previous sections, we haven’t introduced any more sophisticated class and property mappings so far. You should now decide which mapping metadata option you’d like to use in your project and then read more about class and property mappings in the next chapter. Or, if you’re already an experienced Hibernate user, you can read on and find out how the latest Hibernate version allows you to represent a domain model without Java classes.
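To illustrate the kind of generic code that property names and values make possible, here is a JDK-only analogue. It is a sketch, not Hibernate's API: it uses java.beans.Introspector instead of ClassMetadata, and the ItemBean class and the describe() helper are hypothetical names introduced only for this example.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

final class PropertyDumpSketch {

    // Hypothetical stand-in for a persistent class; not the book's Item.
    public static final class ItemBean {
        private final String description;
        private final int initialPrice;

        public ItemBean(String description, int initialPrice) {
            this.description = description;
            this.initialPrice = initialPrice;
        }

        public String getDescription() { return description; }
        public int getInitialPrice() { return initialPrice; }
    }

    // Collects every readable JavaBeans property of the given object into a
    // map sorted by property name, without knowing the class at compile time.
    static Map<String, Object> describe(Object bean) {
        Map<String, Object> result = new TreeMap<>();
        try {
            // Stop at Object.class so the synthetic "class" property is skipped
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null) {
                    result.put(pd.getName(), pd.getReadMethod().invoke(bean));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Generic log output for any bean, here a sample item
        System.out.println(describe(new ItemBean("An item for auction", 99)));
    }
}
```

The difference in a real application is that the SessionFactory metamodel knows which properties are persistent, whereas plain introspection reports every getter.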
3.4
Alternative entity representation
In this book, so far, we’ve always talked about a domain model implementation based on Java classes; we called them POJOs, persistent classes, JavaBeans, or entities. An implementation of a domain model that is based on Java classes with regular properties, collections, and so on, is type-safe. If you access a property of a class, your IDE offers autocompletion based on the strong types of your model, and the compiler checks whether your source is correct. However, you pay for this safety with more time spent on the domain model implementation, and time is money.

In the following sections, we introduce Hibernate’s ability to work with domain models that aren’t implemented with Java classes. We’re basically trading type safety for other benefits and, because nothing is free, more errors at runtime whenever we make a mistake. In Hibernate, you can select an entity mode for your application, or even mix entity modes for a single model. You can even switch between entity modes in a single Session. These are the three built-in entity modes in Hibernate:
■ POJO: A domain model implementation based on POJOs, persistent classes. This is what you have seen so far, and it’s the default entity mode.
■ MAP: No Java classes are required; entities are represented in the Java application with HashMaps. This mode allows quick prototyping of fully dynamic applications.
■ DOM4J: No Java classes are required; entities are represented as XML elements, based on the dom4j API. This mode is especially useful for exporting or importing data, or for rendering and transforming data through XSLT processing.
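The trade-off between a typed and a dynamic representation can be seen without Hibernate. The sketch below is only an illustration in plain Java: TypedItem and newItemMap() are hypothetical names, and nothing here is persisted. It shows that a misspelled attribute name is rejected by the compiler for a class but passes silently for a map.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class EntityRepresentationSketch {

    // Static representation: hypothetical class, checked by the compiler.
    static final class TypedItem {
        private BigDecimal initialPrice;

        BigDecimal getInitialPrice() { return initialPrice; }
        void setInitialPrice(BigDecimal initialPrice) {
            this.initialPrice = initialPrice;
        }
    }

    // Dynamic representation: attribute names are ordinary strings.
    static Map<String, Object> newItemMap(BigDecimal initialPrice) {
        Map<String, Object> item = new HashMap<>();
        item.put("initialPrice", initialPrice);
        return item;
    }

    public static void main(String[] args) {
        TypedItem typed = new TypedItem();
        typed.setInitialPrice(new BigDecimal(99));
        // typed.getInitalPrice() would not compile: the typo is caught early

        Map<String, Object> dynamic = newItemMap(new BigDecimal(99));
        System.out.println(dynamic.get("initialPrice"));
        // The same typo on a map compiles and yields null at runtime
        System.out.println(dynamic.get("initalPrice"));
    }
}
```

No class has to be written or recompiled for the map, which is what makes the dynamic mode attractive for prototyping, at the price of errors that only show up when the code runs.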
There are two reasons why you may want to skip the next section and come back later: First, a static domain model implementation with POJOs is the common case, and dynamic or XML representation are features you may not need right now. Second, we’re going to present some mappings, queries, and other operations that you may not have seen so far, not even with the default POJO entity mode. However, if you feel confident enough with Hibernate, read on. Let’s start with the MAP mode and explore how a Hibernate application can be fully dynamically typed.
3.4.1
Creating dynamic applications
A dynamic domain model is a model that is dynamically typed. For example, instead of a Java class that represents an auction item, you work with a bunch of values in a Java Map. Each attribute of an auction item is represented by a key (the name of the attribute) and its value.

Mapping entity names

First, you need to enable this strategy by naming your business entities. In a Hibernate XML mapping file, you use the entity-name attribute:
There are three interesting things to observe in this mapping file. First, you mix several class mappings in one, something we didn’t recommend earlier. This time you aren’t really mapping Java classes, but logical names of entities. You don’t have a Java source file and an XML mapping file with the same name next to each other, so you’re free to organize your metadata in any way you like.

Second, the <class name="..."> attribute has been replaced with <class entity-name="...">. You also append ...Entity to these logical names for clarity and to distinguish them from other nondynamic mappings that you made earlier with regular POJOs.

Finally, all entity associations, such as <many-to-one> and <one-to-many>, now also refer to logical entity names. The class attribute in the association mappings is now entity-name. This isn’t strictly necessary; Hibernate can recognize that you’re referring to a logical entity name even if you use the class attribute. However, it avoids confusion when you later mix several representations.

Let’s see what working with dynamic entities looks like.

Working with dynamic maps

To create an instance of one of your entities, you set all attribute values in a Java Map:
Map user = new HashMap();
user.put("username", "johndoe");

Map item1 = new HashMap();
item1.put("description", "An item for auction");
item1.put("initialPrice", new BigDecimal(99));
item1.put("seller", user);

Map item2 = new HashMap();
item2.put("description", "Another item for auction");
item2.put("initialPrice", new BigDecimal(123));
item2.put("seller", user);

Collection itemsForSale = new ArrayList();
itemsForSale.add(item1);
itemsForSale.add(item2);
user.put("itemsForSale", itemsForSale);

session.save("UserEntity", user);
The first map is a UserEntity, and you set the username attribute as a key/value pair. The next two maps are ItemEntitys, and here you set the link to the seller of each item by putting the user map into the item1 and item2 maps. You’re effectively linking maps—that’s why this representation strategy is sometimes also called “representation with maps of maps.” The collection on the inverse side of the one-to-many association is initialized with an ArrayList, because you mapped it with bag semantics (Java doesn’t have a bag implementation, but the Collection interface has bag semantics). Finally, the save() method on the Session is given a logical entity name and the user map as an input parameter. Hibernate knows that UserEntity refers to the dynamically mapped entity, and that it should treat the input as a map that has to be saved accordingly. Hibernate also cascades to all elements in the itemsForSale collection; hence, all item maps are also made persistent. One UserEntity and two ItemEntitys are inserted into their respective tables.
FAQ
Can I map a Set in dynamic mode? Collections based on sets don’t work with dynamic entity mode. In the previous code example, imagine that itemsForSale was a Set. A Set checks its elements for duplicates, so when you call add(item1) and add(item2), the hashCode() and equals() methods on these objects are called. However, item1 and item2 are Java Map instances, and a map implements equals() and hashCode() by comparing its current entries (keys and values), not by object identity. Two item maps that hold the same values are therefore treated as duplicates, and the hash code of a map changes whenever one of its entries changes, for example when Hibernate sets the generated identifier; a hash-based Set that already contains the map can then no longer find it. Use bags or lists only if you require collections in dynamic entity mode.
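The behavior of maps inside a hash-based set can be checked with the JDK alone. The sketch below makes two assumptions for the sake of illustration: the helper item() and the attribute names are hypothetical, and the call m.put("id", 1L) merely imitates what happens when an identifier attribute is set on a map after it was added to the set. java.util.Map defines equals() and hashCode() over the entries of the map, which is what both methods below rely on.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class MapInSetSketch {

    // Hypothetical helper that builds a one-attribute item map.
    static Map<String, Object> item(String description) {
        Map<String, Object> m = new HashMap<>();
        m.put("description", description);
        return m;
    }

    // Two distinct map instances with identical entries are equal, so the
    // set keeps only one of them.
    static int sizeAfterAddingEqualMaps() {
        Set<Map<String, Object>> set = new HashSet<>();
        set.add(item("An item for auction"));
        set.add(item("An item for auction"));
        return set.size();
    }

    // Adding an entry after insertion changes the map's hash code, so the
    // set looks for the element under the wrong hash.
    static boolean foundAfterIdentifierAssigned() {
        Map<String, Object> m = item("An item for auction");
        Set<Map<String, Object>> set = new HashSet<>();
        set.add(m);
        m.put("id", 1L); // imitates a generated identifier being set
        return set.contains(m);
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingEqualMaps());     // prints 1
        System.out.println(foundAfterIdentifierAssigned()); // prints false
    }
}
```

A List has neither problem, because it never consults hashCode() or equals() when elements are added.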
Hibernate handles maps just like POJO instances. For example, making a map persistent triggers identifier assignment; each map in persistent state has an identifier attribute set with the generated value. Furthermore, persistent maps are automatically checked for any modifications inside a unit of work. To set a new price on an item, for example, you can load it and then let Hibernate do all the work:
Long storedItemId = (Long) item1.get("id");

Session session = getSessionFactory().openSession();
session.beginTransaction();

Map loadedItemMap = (Map) session.load("ItemEntity", storedItemId);
loadedItemMap.put("initialPrice", new BigDecimal(100));

session.getTransaction().commit();
session.close();
All Session methods that have class parameters such as load() also come in an overloaded variation that accepts entity names. After loading an item map, you set a new price and make the modification persistent by committing the transaction, which, by default, triggers dirty checking and flushing of the Session. You can also refer to entity names in HQL queries:
List queriedItemMaps =
    session.createQuery("from ItemEntity where initialPrice >= :p")
           .setParameter("p", new BigDecimal(100))
           .list();
This query returns a collection of ItemEntity maps. They are in persistent state. Let’s take this one step further and mix a POJO model with dynamic maps. There are two reasons why you would want to mix a static implementation of your domain model with a dynamic map representation:
■ You want to work with a static model based on POJO classes by default, but sometimes you want to represent data easily as maps of maps. This can be particularly useful in reporting, or whenever you have to implement a generic user interface that can represent various entities dynamically.
■ You want to map a single POJO class of your model to several tables and then select the table at runtime by specifying a logical entity name.
You may find other use cases for mixed entity modes, but they’re so rare that we want to focus on the most obvious. First, therefore, you’ll mix a static POJO model and enable dynamic map representation for some of the entities, some of the time.

Mixing dynamic and static entity modes

To enable a mixed model representation, edit your XML mapping metadata and declare a POJO class name and a logical entity name:
... ...
Obviously, you also need the two classes, model.ItemPojo and model.UserPojo, that implement the properties of these entities. You still base the many-to-one and one-to-many associations between the two entities on logical names. Hibernate will primarily use the logical names from now on. For example, the following code does not work:
UserPojo user = new UserPojo();
...
ItemPojo item1 = new ItemPojo();
...
ItemPojo item2 = new ItemPojo();
...
Collection itemsForSale = new ArrayList();
...

session.save(user);
The preceding example creates a few objects, sets their properties, and links them, and then tries to save the objects through cascading by passing the user instance to save(). Hibernate inspects the type of this object and tries to figure out what entity it is, and because Hibernate now exclusively relies on logical entity names, it can’t find a mapping for model.UserPojo. You need to tell Hibernate the logical name when working with a mixed representation mapping:
... session.save("UserEntity", user);
Once you change this line, the previous code example works. Next, consider loading, and what is returned by queries. By default, a particular SessionFactory
is in POJO entity mode, so the following operations return instances of model.ItemPojo:
Long storedItemId = item1.getId();

ItemPojo loadedItemPojo =
    (ItemPojo) session.load("ItemEntity", storedItemId);

List queriedItemPojos =
    session.createQuery("from ItemEntity where initialPrice >= :p")
           .setParameter("p", new BigDecimal(100))
           .list();
You can switch to a dynamic map representation either globally or temporarily, but a global switch of the entity mode has serious consequences. To switch globally, add the following to your Hibernate configuration; e.g., in hibernate.cfg.xml:
<property name="default_entity_mode">dynamic-map</property>
All Session operations now either expect or return dynamically typed maps! The previous code examples that stored, loaded, and queried POJO instances no longer work; you need to store and load maps. It’s more likely that you want to switch to another entity mode temporarily, so let’s assume that you leave the SessionFactory in the default POJO mode. To switch to dynamic maps in a particular Session, you can open up a new temporary Session on top of the existing one. The following code uses such a temporary Session to store a new auction item for an existing seller:
Session dynamicSession = session.getSession(EntityMode.MAP);

Map seller = (Map) dynamicSession.load("UserEntity", user.getId() );

Map newItemMap = new HashMap();
newItemMap.put("description", "An item for auction");
newItemMap.put("initialPrice", new BigDecimal(99));
newItemMap.put("seller", seller);

dynamicSession.save("ItemEntity", newItemMap);

Long storedItemId = (Long) newItemMap.get("id");

Map loadedItemMap =
    (Map) dynamicSession.load("ItemEntity", storedItemId);

List queriedItemMaps =
    dynamicSession
        .createQuery("from ItemEntity where initialPrice >= :p")
        .setParameter("p", new BigDecimal(100))
        .list();
The temporary dynamicSession that is opened with getSession() doesn’t need to be flushed or closed; it inherits the context of the original Session. You use it
only to load, query, or save data in the chosen representation, which is the EntityMode.MAP in the previous example. Note that you can’t link a map with a POJO instance; the seller reference has to be a HashMap, not an instance of UserPojo. We mentioned that another good use case for logical entity names is the mapping of one POJO to several tables, so let’s look at that.

Mapping a class several times

Imagine that you have several tables with some columns in common. For example, you could have ITEM_AUCTION and ITEM_SALE tables. Usually you map each table to an entity persistent class, ItemAuction and ItemSale respectively. With the help of entity names, you can save work and implement a single persistent class. To map both tables to a single persistent class, use different entity names (and usually different property mappings):
... ...
The model.Item persistent class has all the properties you mapped: id, description, initialPrice, and salesPrice. Depending on the entity name you use at runtime, some properties are considered persistent and others transient:
Item itemForAuction = new Item();
itemForAuction.setDescription("An item for auction");
itemForAuction.setInitialPrice( new BigDecimal(99) );
session.save("ItemAuction", itemForAuction);

Item itemForSale = new Item();
itemForSale.setDescription("An item for sale");
itemForSale.setSalesPrice( new BigDecimal(123) );
session.save("ItemSale", itemForSale);
Thanks to the logical entity name, Hibernate knows into which table it should insert the data. Depending on the entity name you use for loading and querying entities, Hibernate selects from the appropriate table. Scenarios in which you need this functionality are rare, and you’ll probably agree with us that the previous use case isn’t good or common. In the next section, we introduce the third built-in Hibernate entity mode, the representation of domain entities as XML documents.
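The lookup by logical entity name described in this section can be pictured as a small registry. The following plain-Java sketch is only an illustration under stated assumptions: EntityNameSketch and its methods are hypothetical names, not Hibernate API, and the split of properties between the two entity names is assumed from the preceding example code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of entity-name lookup; not the Hibernate API.
class EntityNameSketch {

    // What one mapping contributes: a target table and its persistent properties.
    static final class Mapping {
        final String table;
        final Set<String> persistentProperties;

        Mapping(String table, String... properties) {
            this.table = table;
            this.persistentProperties = new LinkedHashSet<>(Arrays.asList(properties));
        }
    }

    private final Map<String, Mapping> byEntityName = new HashMap<>();

    void register(String entityName, Mapping mapping) {
        byEntityName.put(entityName, mapping);
    }

    // Resolves the table for a logical entity name; unknown names are rejected.
    String tableFor(String entityName) {
        Mapping mapping = byEntityName.get(entityName);
        if (mapping == null) {
            throw new IllegalArgumentException("Unknown entity: " + entityName);
        }
        return mapping.table;
    }

    // A property is persistent only if the chosen entity name maps it.
    boolean isPersistent(String entityName, String property) {
        Mapping mapping = byEntityName.get(entityName);
        return mapping != null && mapping.persistentProperties.contains(property);
    }

    // Registry for the two entity names used in the text (property split is assumed).
    static EntityNameSketch example() {
        EntityNameSketch registry = new EntityNameSketch();
        registry.register("ItemAuction",
                new Mapping("ITEM_AUCTION", "id", "description", "initialPrice"));
        registry.register("ItemSale",
                new Mapping("ITEM_SALE", "id", "description", "salesPrice"));
        return registry;
    }

    public static void main(String[] args) {
        EntityNameSketch registry = example();
        System.out.println(registry.tableFor("ItemSale"));                   // ITEM_SALE
        System.out.println(registry.isPersistent("ItemSale", "salesPrice")); // true
        System.out.println(registry.isPersistent("ItemAuction", "salesPrice")); // false
    }
}
```

The same Java class sits behind both entries; only the entity name passed at runtime decides which table and which properties are in play.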
3.4.2
Representing data in XML
XML is nothing but a text file format; it has no inherent capabilities that qualify it as a medium for data storage or data management. The XML data model is weak, its type system is complex and underpowered, its data integrity is almost completely procedural, and it introduces hierarchical data structures that were outdated decades ago. However, data in XML format is attractive to work with in Java; we have nice tools. For example, we can transform XML data with XSLT, which we consider one of the best use cases. Hibernate has no built-in functionality to store data in an XML format; it relies on a relational representation and SQL, and the benefits of this strategy should be clear. On the other hand, Hibernate can load and present data to the application developer in an XML format. This allows you to use a sophisticated set of tools without any additional transformation steps. Let’s assume that you work in default POJO mode and that you quickly want to obtain some data represented in XML. Open a temporary Session with the EntityMode.DOM4J:
Session dom4jSession = session.getSession(EntityMode.DOM4J);

Element userXML =
    (Element) dom4jSession.load(User.class, storedUserId);
What is returned here is a dom4j Element, and you can use the dom4j API to read and manipulate it. For example, you can pretty-print it to your console with the following snippet:
try {
    OutputFormat format = OutputFormat.createPrettyPrint();
    XMLWriter writer = new XMLWriter( System.out, format);
    writer.write( userXML );
} catch (IOException ex) {
    throw new RuntimeException(ex);
}
If we assume that you reuse the POJO classes and data from the previous examples, you see one User instance and two Item instances (for clarity, we no longer name them UserPojo and ItemPojo):
1 johndoe - 2 99 An item for auction 1
- 3 123 Another item for auction 1
Hibernate assumes default XML element names: the entity and property names. You can also see that collection elements are embedded, and that circular references are resolved through identifiers (the <seller> element). You can change this default XML representation by adding node attributes to your Hibernate mapping metadata:
Each node attribute defines the XML representation:
■ A node="name" attribute on a class mapping defines the name of the XML element for that entity.
■ A node="name" attribute on any property mapping specifies that the property content should be represented as the text of an XML element of the given name.
■ A node="@name" attribute on any property mapping specifies that the property content should be represented as an XML attribute value of the given name.
■ A node="name/@attname" attribute on any property mapping specifies that the property content should be represented as an XML attribute value of the given name, on a child element of the given name.
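The three property-level node forms listed above can be sketched as a small path interpreter. This is only an illustration of the rules, not Hibernate's code: the class NodePathSketch is hypothetical, and it uses the JDK's org.w3c.dom API instead of dom4j so that the example stays self-contained.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration of the node="..." rules; not the Hibernate API.
class NodePathSketch {

    // Writes a property value into the entity element according to its node path.
    static void apply(Element entity, String nodePath, String value) {
        if (nodePath.startsWith("@")) {
            // node="@name": attribute on the entity element itself
            entity.setAttribute(nodePath.substring(1), value);
            return;
        }
        int slash = nodePath.indexOf("/@");
        if (slash >= 0) {
            // node="name/@attname": attribute on a child element
            Element child = childNamed(entity, nodePath.substring(0, slash));
            child.setAttribute(nodePath.substring(slash + 2), value);
            return;
        }
        // node="name": text content of a child element
        childNamed(entity, nodePath).setTextContent(value);
    }

    // Returns the first child element with this name, creating it if necessary.
    static Element childNamed(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        Element created = parent.getOwnerDocument().createElement(name);
        parent.appendChild(created);
        return created;
    }

    // Creates a new document whose root element represents one entity.
    static Element newEntityElement(String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement(name);
            doc.appendChild(root);
            return root;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Element item = newEntityElement("item");
        apply(item, "@id", "2");
        apply(item, "item-details/@initial-price", "99");
        apply(item, "item-details/@description", "An item for auction");
        Element details = childNamed(item, "item-details");
        System.out.println(item.getAttribute("id"));               // 2
        System.out.println(details.getAttribute("initial-price")); // 99
    }
}
```

Both item-details paths land on the same child element, which is how several properties can share one nested XML element.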
The embed-xml option is used to trigger embedding or referencing of associated entity data. The updated mapping results in the following XML representation of the same data you’ve seen before:
-
-
Be careful with the embed-xml option—you can easily create circular references that result in an endless loop! Finally, data in an XML representation is transactional and persistent, so you can modify queried XML elements and let Hibernate take care of updating the underlying tables:
Element itemXML =
    (Element) dom4jSession.get(Item.class, storedItemId);
itemXML.element("item-details")
       .attribute("initial-price")
       .setValue("100");

session.flush(); // Hibernate executes UPDATEs

Element userXML =
    (Element) dom4jSession.get(User.class, storedUserId);

Element newItem = DocumentHelper.createElement("item");
Element newItemDetails = newItem.addElement("item-details");
newItem.addAttribute("seller-id",
                     userXML.attribute("id").getValue() );
newItemDetails.addAttribute("initial-price", "123");
newItemDetails.addAttribute("description", "A third item");

dom4jSession.save(Item.class.getName(), newItem);
dom4jSession.flush(); // Hibernate executes INSERTs
There is no limit to what you can do with the XML that is returned by Hibernate. You can display, export, and transform it in any way you like. See the dom4j documentation for more information. Finally, note that you can use all three built-in entity modes simultaneously, if you like. You can map a static POJO implementation of your domain model, switch to dynamic maps for your generic user interface, and export data into XML. Or, you can write an application that doesn’t have any domain classes, only dynamic maps and XML. We have to warn you, though, that prototyping in the software industry often means that customers end up with the prototype that nobody wanted to throw away—would you buy a prototype car? We highly recommend that you rely on static domain models if you want to create a maintainable system.
We won’t consider dynamic models or XML representation again in this book. Instead, we’ll focus on static persistent classes and how they are mapped.
3.5
Summary
In this chapter, we focused on the design and implementation of a rich domain model in Java. You now understand that persistent classes in a domain model should be free of crosscutting concerns, such as transactions and security. Even persistence-related concerns should not leak into the domain model implementation. You also know how important transparent persistence is if you want to execute and test your business objects independently and easily. You have learned the best practices and requirements for the POJO and JPA entity programming model, and what concepts they have in common with the old JavaBean specification. We had a closer look at the implementation of persistent classes, and how attributes and relationships are best represented. To be prepared for the next part of the book, and to learn all the object/relational mapping options, you needed to make an educated decision to use either XML mapping files or JDK 5.0 annotations, or possibly a combination of both. You’re now ready to write more complex mappings in both formats. For convenience, table 3.1 summarizes the differences between Hibernate and Java Persistence related to concepts discussed in this chapter.
Table 3.1 Hibernate and JPA comparison chart for chapter 3

Hibernate Core | Java Persistence and EJB 3.0
Persistent classes require a no-argument constructor with public or protected visibility if proxy-based lazy loading is used. | The JPA specification mandates a no-argument constructor with public or protected visibility for all entity classes.
Persistent collections must be typed to interfaces. Hibernate supports all JDK interfaces. | Persistent collections must be typed to interfaces. Only a subset of all interfaces (no sorted collections, for example) is considered fully portable.
Persistent properties can be accessed through fields or accessor methods at runtime, or a completely customizable strategy can be applied. | Persistent properties of an entity class are accessed through fields or accessor methods, but not both if full portability is required.
The XML metadata format supports all possible Hibernate mapping options. | JPA annotations cover all basic and most advanced mapping options. Hibernate Annotations are required for exotic mappings and tuning.
XML mapping metadata can be defined globally, and XML placeholders are used to keep metadata free from dependencies. | Global metadata is only fully portable if declared in the standard orm.xml metadata file.
In the next part of the book, we show you all possible basic and some advanced mapping techniques, for classes, properties, inheritance, collections, and associations. You’ll learn how to solve the structural object/relational mismatch.
Part 2 Mapping concepts and strategies
This part is all about actual object/relational mapping, from classes and properties to tables and columns. Chapter 4 starts with regular class and property mappings, and explains how you can map fine-grained Java domain models. Next, in chapter 5, you’ll see how to map more complex class inheritance hierarchies and how to extend Hibernate's functionality with the powerful custom mapping type system. In chapters 6 and 7, we show you how to map Java collections and associations between classes, with many sophisticated examples. Finally, you’ll find chapter 8 most interesting if you need to introduce Hibernate in an existing application, or if you have to work with legacy database schemas and hand-written SQL. We also talk about customized SQL DDL for schema generation in this chapter. After reading this part of the book, you’ll be ready to create even the most complex mappings quickly and with the right strategy. You’ll understand how the problem of inheritance mapping can be solved, and how collections and associations can be mapped. You’ll also be able to tune and customize Hibernate for integration with any existing database schema or application.
CHAPTER 4
Mapping persistent classes

This chapter covers
■ Understanding the entity and value-type concept
■ Mapping classes with XML and annotations
■ Fine-grained property and component mappings
This chapter presents the fundamental mapping options, explaining how classes and properties are mapped to tables and columns. We show and discuss how you can handle database identity and primary keys, and how various other metadata settings can be used to customize how Hibernate loads and stores objects. All mapping examples are done in Hibernate’s native XML format, and with JPA annotations and XML descriptors, side by side. We also look closely at the mapping of fine-grained domain models, and at how properties and embedded components are mapped. First, though, we define the essential distinction between entities and value types, and explain how you should approach the object/relational mapping of your domain model.
4.1
Understanding entities and value types
Entities are persistent types that represent first-class business objects (the term object is used here in its natural sense). In other words, some of the classes and types you have to deal with in an application are more important, which naturally makes others less important. You probably agree that in CaveatEmptor, Item is a more important class than String. User is probably more important than Address. What makes something important? Let’s look at the issue from a different perspective.
4.1.1
Fine-grained domain models
A major objective of Hibernate is support for fine-grained domain models, which we isolated as the most important requirement for a rich domain model. It’s one reason why we work with POJOs. In crude terms, fine-grained means more classes than tables. For example, a user may have both a billing address and a home address. In the database, you may have a single USERS table with the columns BILLING_STREET, BILLING_CITY, and BILLING_ZIPCODE, along with HOME_STREET, HOME_CITY, and HOME_ZIPCODE. (Remember the problem of SQL types we discussed in chapter 1?) In the domain model, you could use the same approach, representing the two addresses as six string-valued properties of the User class. But it’s much better to model this using an Address class, where User has the billingAddress and homeAddress properties, thus using three classes for one table. This domain model achieves improved cohesion and greater code reuse, and it’s more understandable than SQL systems with inflexible type systems. In
the past, many ORM solutions didn’t provide especially good support for this kind of mapping. Hibernate emphasizes the usefulness of fine-grained classes for implementing type safety and behavior. For example, many people model an email address as a string-valued property of User. A more sophisticated approach is to define an EmailAddress class, which adds higher-level semantics and behavior—it may provide a sendEmail() method. This granularity problem leads us to a distinction of central importance in ORM. In Java, all classes are of equal standing—all objects have their own identity and lifecycle. Let’s walk through an example.
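As an aside, the fine-grained User and Address model described above can be sketched in plain Java. The sketch is only an illustration: FineGrainedSketch and toUsersRow() are hypothetical names, the sample values are invented, and the flattening is written by hand here, whereas a mapping tool would derive it from metadata. The column names are the ones from the USERS table mentioned in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: more classes than tables.
class FineGrainedSketch {

    // Fine-grained class that has no table of its own.
    static final class Address {
        final String street;
        final String city;
        final String zipcode;

        Address(String street, String city, String zipcode) {
            this.street = street;
            this.city = city;
            this.zipcode = zipcode;
        }
    }

    static final class User {
        Address homeAddress;
        Address billingAddress;

        // Flattens both addresses into the columns of a single USERS row.
        Map<String, String> toUsersRow() {
            Map<String, String> row = new LinkedHashMap<>();
            flatten(row, "BILLING", billingAddress);
            flatten(row, "HOME", homeAddress);
            return row;
        }

        private static void flatten(Map<String, String> row, String prefix, Address a) {
            row.put(prefix + "_STREET", a == null ? null : a.street);
            row.put(prefix + "_CITY", a == null ? null : a.city);
            row.put(prefix + "_ZIPCODE", a == null ? null : a.zipcode);
        }
    }

    public static void main(String[] args) {
        User user = new User();
        user.homeAddress = new Address("1 Home Street", "Hometown", "11111");
        user.billingAddress = new Address("2 Billing Road", "Billington", "22222");
        // One row, six address columns, two Address instances
        System.out.println(user.toUsersRow().keySet());
    }
}
```

The Address class carries the structure once, and the two properties reuse it with different column prefixes.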
4.1.2
Defining the concept
Two people live in the same apartment, and they both register user accounts in CaveatEmptor. Naturally, each account is represented by one instance of User, so you have two entity instances. In the CaveatEmptor model, the User class has a homeAddress association with the Address class. Do both User instances have a runtime reference to the same Address instance or does each User instance have a reference to its own Address? If Address is supposed to support shared runtime references, it’s an entity type. If not, it’s likely a value type and hence is dependent on a single reference by an owning entity instance, which also provides identity. We advocate a design with more classes than tables: One row represents multiple instances. Because database identity is implemented by primary key value, some persistent objects won’t have their own identity. In effect, the persistence mechanism implements pass-by-value semantics for some classes! One of the objects represented in the row has its own identity, and others depend on that. In the previous example, the columns in the USERS table that contain address information are dependent on the identifier of the user, the primary key of the table. An instance of Address is dependent on an instance of User. Hibernate makes the following essential distinction:
■ An object of entity type has its own database identity (primary key value). An object reference to an entity instance is persisted as a reference in the database (a foreign key value). An entity has its own lifecycle; it may exist independently of any other entity. Examples in CaveatEmptor are User, Item, and Category.
■ An object of value type has no database identity; it belongs to an entity instance and its persistent state is embedded in the table row of the owning entity. Value types don’t have identifiers or identifier properties. The lifespan of a value type instance is bounded by the lifespan of the owning entity instance. A value type doesn’t support shared references: If two users live in the same apartment, they each have a reference to their own homeAddress instance. The most obvious value types are classes like Strings and Integers, but all JDK classes are considered value types. User-defined classes can also be mapped as value types; for example, CaveatEmptor has Address and MonetaryAmount.

Identification of entities and value types in your domain model isn’t an ad hoc task but follows a certain procedure.
4.1.3
Identifying entities and value types
You may find it helpful to add stereotype information to your UML class diagrams so you can immediately see and distinguish entities and value types. This practice also forces you to think about this distinction for all your classes, which is a first step to an optimal mapping and well-performing persistence layer. See figure 4.1 for an example.

Figure 4.1 Stereotypes for entities and value types have been added to the diagram.

The Item and User classes are obvious entities. They each have their own identity, their instances have references from many other instances (shared references), and they have independent lifecycles. Identifying the Address as a value type is also easy: A particular Address instance is referenced by only a single User instance. You know this because the association has been created as a composition, where the User instance has been made fully responsible for the lifecycle of the referenced Address instance. Therefore, Address objects can’t be referenced by anyone else and don’t need their own identity.

The Bid class is a problem. In object-oriented modeling, you express a composition (the association between Item and Bid with the diamond), and an Item manages the lifecycles of all the Bid objects to which it has a reference (it’s a collection of references). This seems reasonable, because the bids would be useless if an Item no longer existed. But at the same time, there is another association to Bid: An Item may hold a reference to its successfulBid. The successful bid must also be one of the bids referenced by the collection, but this isn’t expressed in the UML. In any case, you have to deal with possible shared references to Bid instances, so the Bid class needs to be an entity. It has a dependent lifecycle, but it must have its own identity to support shared references.

You’ll often find this kind of mixed behavior; however, your first reaction should be to make everything a value-typed class and promote it to an entity only when absolutely necessary. Try to simplify your associations: Collections, for example, sometimes add complexity without offering any advantages. Instead of mapping a persistent collection of Bid references, you can write a query to obtain all the bids for an Item (we’ll come back to this point again in chapter 7). As the next step, take your domain model diagram and implement POJOs for all entities and value types. You have to take care of three things:
■ Shared references: Write your POJO classes in a way that avoids shared references to value type instances. For example, make sure an Address object can be referenced by only one User. One way to do this is to make it immutable and enforce the relationship with the Address constructor.
■ Lifecycle dependencies: As discussed, the lifecycle of a value-type instance is bound to that of its owning entity instance. If a User object is deleted, its Address dependent object(s) have to be deleted as well. There is no notion or keyword for this in Java, but your application workflow and user interface must be designed to respect and expect lifecycle dependencies. Persistence metadata includes the cascading rules for all dependencies.
■ Identity: Entity classes need an identifier property in almost all cases. User-defined value-type classes (and JDK classes) don’t have an identifier property, because instances are identified through the owning entity.
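The first point, an immutable Address whose constructor enforces the relationship with its owner, might look like the following sketch. It is one possible reading of that advice, not the CaveatEmptor implementation; ValueTypeSketch, getOwner(), and the sample values are hypothetical.

```java
// Hypothetical illustration of a value type bound to exactly one owner.
class ValueTypeSketch {

    static final class User {
        private final String username;
        private Address homeAddress; // set only by the Address constructor

        User(String username) {
            this.username = username;
        }

        String getUsername() {
            return username;
        }

        Address getHomeAddress() {
            return homeAddress;
        }
    }

    // Immutable: no setters, and the owner is fixed at construction time.
    static final class Address {
        private final User owner;
        private final String street;
        private final String city;

        Address(User owner, String street, String city) {
            if (owner == null) {
                throw new IllegalArgumentException("An Address needs an owning User");
            }
            this.owner = owner;
            this.street = street;
            this.city = city;
            // Replaces any previous address of the owner; no other User can
            // be handed this instance because there is no public setter.
            owner.homeAddress = this;
        }

        User getOwner() {
            return owner;
        }

        String getStreet() {
            return street;
        }

        String getCity() {
            return city;
        }
    }

    public static void main(String[] args) {
        User first = new User("johndoe");
        User second = new User("janeroe");

        // Same apartment, but each user gets its own Address instance.
        new Address(first, "1 Main Street", "Springfield");
        new Address(second, "1 Main Street", "Springfield");

        System.out.println(first.getHomeAddress() == second.getHomeAddress()); // false
    }
}
```

Changing a user's address then means creating a new Address for that user, which matches the pass-by-value behavior of value types.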
We’ll come back to class associations and lifecycle rules when we discuss more advanced mappings later in the book. However, object identity is a subject you have to understand at this point.
4.2
Mapping entities with identity
It’s vital to understand the difference between object identity and object equality before we discuss terms like database identity and the way Hibernate manages identity. Next, we explore how object identity and equality relate to database (primary key) identity.
4.2.1
Understanding Java identity and equality
Java developers understand the difference between Java object identity and equality. Object identity, ==, is a notion defined by the Java virtual machine. Two object references are identical if they point to the same memory location. On the other hand, object equality is a notion defined by classes that implement the equals() method, sometimes also referred to as equivalence. Equivalence means that two different (nonidentical) objects have the same value. Two different instances of String are equal if they represent the same sequence of characters, even though they each have their own location in the memory space of the virtual machine. (If you’re a Java guru, we acknowledge that String is a special case. Assume we used a different class to make the same point.) Persistence complicates this picture. With object/relational persistence, a persistent object is an in-memory representation of a particular row of a database table. Along with Java identity (memory location) and object equality, you pick up database identity (which is the location in the persistent data store). You now have three methods for identifying objects:
■ Objects are identical if they occupy the same memory location in the JVM. This can be checked by using the == operator. This concept is known as object identity.
■ Objects are equal if they have the same value, as defined by the equals(Object o) method. Classes that don’t explicitly override this method inherit the implementation defined by java.lang.Object, which compares object identity. This concept is known as equality.
■ Objects stored in a relational database are identical if they represent the same row or, equivalently, if they share the same table and primary key value. This concept is known as database identity.
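The three notions can be compared side by side in a short, self-contained sketch. The Category class here is a stripped-down stand-in with only an identifier, and sameDatabaseIdentity() is a hypothetical helper; both instances are assumed to belong to the same table.

```java
// Hypothetical illustration of object identity, equality, and database identity.
class IdentitySketch {

    // Stand-in entity: equals() is deliberately not overridden,
    // so it inherits the identity comparison of java.lang.Object.
    static final class Category {
        private final Long id;

        Category(Long id) {
            this.id = id;
        }

        Long getId() {
            return id;
        }
    }

    // Database identity: same class (hence same table) and same primary key value.
    static boolean sameDatabaseIdentity(Category a, Category b) {
        return a.getId() != null && a.getId().equals(b.getId());
    }

    public static void main(String[] args) {
        String s1 = new String("abc");
        String s2 = new String("abc");
        System.out.println(s1 == s2);      // false: two memory locations
        System.out.println(s1.equals(s2)); // true: same character sequence

        Category c1 = new Category(42L);
        Category c2 = new Category(42L);
        System.out.println(c1 == c2);                     // false
        System.out.println(c1.equals(c2));                // false: inherited from Object
        System.out.println(sameDatabaseIdentity(c1, c2)); // true: same row
    }
}
```

Two in-memory instances can therefore represent the same row while being neither identical nor equal in the Java sense.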
We now need to look at how database identity relates to object identity in Hibernate, and how database identity is expressed in the mapping metadata.
4.2.2
Handling database identity
Hibernate exposes database identity to the application in two ways:
■ The value of the identifier property of a persistent instance
■ The value returned by Session.getIdentifier(Object entity)
Adding an identifier property to entities

The identifier property is special: its value is the primary key value of the database row represented by the persistent instance. We don’t usually show the identifier property in the domain model diagrams. In the examples, the identifier property is always named id. If myCategory is an instance of Category, calling myCategory.getId() returns the primary key value of the row represented by myCategory in the database. Let’s implement an identifier property for the Category class:
public class Category {
    private Long id;
    ...
    public Long getId() {
        return this.id;
    }

    private void setId(Long id) {
        this.id = id;
    }
    ...
}
Should you make the accessor methods for the identifier property private scope or public? Well, database identifiers are often used by the application as a convenient handle to a particular instance, even outside the persistence layer. For example, it’s common for web applications to display the results of a search screen to the user as a list of summary information. When the user selects a particular element, the application may need to retrieve the selected object, and it’s common to use a lookup by identifier for this purpose—you’ve probably already used identifiers this way, even in applications that rely on JDBC. It’s usually appropriate to fully expose the database identity with a public identifier property accessor. On the other hand, you usually declare the setId() method private and let Hibernate generate and set the identifier value. Or, you map it with direct field access and implement only a getter method. (The exception to this rule is classes with natural keys, where the value of the identifier is assigned by the application before the object is made persistent instead of being generated by Hibernate. We discuss natural keys in chapter 8.) Hibernate doesn’t allow you to change the identifier value of a persistent instance after it’s first assigned. A primary key value never changes—otherwise the attribute wouldn’t be a suitable primary key candidate!
The Java type of the identifier property, java.lang.Long in the previous example, depends on the primary key type of the CATEGORY table and how it’s mapped in Hibernate metadata.

Mapping the identifier property

A regular (noncomposite) identifier property is mapped in Hibernate XML files with the <id> element:
...
The identifier property is mapped to the primary key column CATEGORY_ID of the table CATEGORY. The Hibernate type for this property is long, which maps to a BIGINT column type in most databases and which has also been chosen to match the type of the identity value produced by the native identifier generator. (We discuss identifier generation strategies in the next section.) For a JPA entity class, you use annotations in the Java source code to map the identifier property:
@Entity
@Table(name="CATEGORY")
public class Category {
    private Long id;
    ...
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "CATEGORY_ID")
    public Long getId() {
        return this.id;
    }

    private void setId(Long id) {
        this.id = id;
    }
    ...
}
The @Id annotation on the getter method marks it as the identifier property, and @GeneratedValue with the GenerationType.AUTO option translates into a native identifier generation strategy, like the native option in XML Hibernate mappings. Note that if you don’t define a strategy, the default is also GenerationType.AUTO, so you could have omitted this attribute altogether. You also specify a database column; otherwise Hibernate would use the property name. The mapping type is implied by the Java property type, java.lang.Long. Of course, you can also use direct field access for all properties, including the database identifier:
@Entity
@Table(name="CATEGORY")
public class Category {
    @Id @GeneratedValue
    @Column(name = "CATEGORY_ID")
    private Long id;
    ...
    public Long getId() {
        return this.id;
    }
    ...
}
Mapping annotations are placed on the field declaration when direct field access is enabled, as defined by the standard. Whether field or property access is enabled for an entity depends on the position of the mandatory @Id annotation. In the preceding example, it’s present on a field, so all attributes of the class are accessed by Hibernate through fields. The example before that, annotated on the getId() method, enables access to all attributes through getter and setter methods. Alternatively, you can use JPA XML descriptors to create your identifier mapping:
...
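The rule that the position of the identifier annotation selects field or property access can be sketched with plain reflection. This is an illustration only: the nested @Id annotation below is a local stand-in for the JPA annotation so that the example compiles without any library, and AccessTypeSketch, AccessType, and detect() are hypothetical names rather than Hibernate or JPA API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical illustration of how the @Id position can select the access strategy.
class AccessTypeSketch {

    // Local stand-in for the JPA @Id annotation (an assumption for this sketch).
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD})
    @interface Id {}

    enum AccessType { FIELD, PROPERTY }

    // Identifier annotated on the field: all attributes accessed through fields.
    static class FieldMappedCategory {
        @Id private Long id;
        public Long getId() { return id; }
    }

    // Identifier annotated on the getter: all attributes accessed through accessors.
    static class PropertyMappedCategory {
        private Long id;
        @Id public Long getId() { return id; }
    }

    static AccessType detect(Class<?> entityClass) {
        for (Field field : entityClass.getDeclaredFields()) {
            if (field.isAnnotationPresent(Id.class)) {
                return AccessType.FIELD;
            }
        }
        for (Method method : entityClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(Id.class)) {
                return AccessType.PROPERTY;
            }
        }
        throw new IllegalArgumentException(
                "No identifier annotation on " + entityClass.getName());
    }

    public static void main(String[] args) {
        System.out.println(detect(FieldMappedCategory.class));    // FIELD
        System.out.println(detect(PropertyMappedCategory.class)); // PROPERTY
    }
}
```

A real provider also has to walk superclasses and handle explicit overrides; the sketch covers only the basic rule stated in the text.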
In addition to operations for testing Java object identity, (a == b), and object equality, ( a.equals(b) ), you may now use a.getId().equals( b.getId() ) to test database identity. What do these notions have in common? In what situations do they all return true? The time when all are true is called the scope of
guaranteed object identity; and we’ll come back to this subject in chapter 9, section 9.2, “Object identity and equality.” Using database identifiers in Hibernate is easy and straightforward. Choosing a good primary key (and key-generation strategy) may be more difficult. We discuss this issue next.
4.2.3
Database primary keys
Hibernate needs to know your preferred strategy for generating primary keys. First, though, let’s define primary key.

Selecting a primary key

The candidate key is a column or set of columns that could be used to identify a particular row in a table. To become a primary key, a candidate key must satisfy the following properties:
■ ■ ■
Its value (for any column of the candidate key) is never null. Each row has a unique value. The value of a particular row never changes.
If a table has only one identifying attribute, it’s, by definition, the primary key. However, several columns or combinations of columns may satisfy these properties for a particular table; you choose between candidate keys to decide the best primary key for the table. Candidate keys not chosen as the primary key should be declared as unique keys in the database. Many legacy SQL data models use natural primary keys. A natural key is a key with business meaning: an attribute or combination of attributes that is unique by virtue of its business semantics. Examples of natural keys are the U.S. Social Security Number and Australian Tax File Number. Distinguishing natural keys is simple: If a candidate key attribute has meaning outside the database context, it’s a natural key, whether or not it’s automatically generated. Think about the application users: If they refer to a key attribute when talking about and working with the application, it’s a natural key. Experience has shown that natural keys almost always cause problems in the long run. A good primary key must be unique, constant, and required (never null or unknown). Few entity attributes satisfy these requirements, and some that do can’t be efficiently indexed by SQL databases (although this is an implementation detail and shouldn’t be the primary motivation for or against a particular key). In
addition, you should make certain that a candidate key definition can never change throughout the lifetime of the database before making it a primary key. Changing the value (or even definition) of a primary key, and all foreign keys that refer to it, is a frustrating task. Furthermore, natural candidate keys can often be found only by combining several columns in a composite natural key. These composite keys, although certainly appropriate for some relations (like a link table in a many-to-many relationship), usually make maintenance, ad-hoc queries, and schema evolution much more difficult.
For these reasons, we strongly recommend that you consider synthetic identifiers, also called surrogate keys. Surrogate keys have no business meaning; they’re unique values generated by the database or application. Application users ideally don’t see or refer to these key values; they’re part of the system internals. Introducing a surrogate key column is also appropriate in a common situation: If there are no candidate keys, a table is by definition not a relation as defined by the relational model (it permits duplicate rows), and so you have to add a surrogate key column. There are a number of well-known approaches to generating surrogate key values.
Selecting a key generator
Hibernate has several built-in identifier-generation strategies. We list the most useful options in table 4.1.
Table 4.1 Hibernate’s built-in identifier-generator modules

Generator name: native
JPA GenerationType: AUTO
Options: –
Description: The native identity generator picks other identity generators like identity, sequence, or hilo, depending on the capabilities of the underlying database. Use this generator to keep your mapping metadata portable to different database management systems.

Generator name: identity
JPA GenerationType: IDENTITY
Options: –
Description: This generator supports identity columns in DB2, MySQL, MS SQL Server, Sybase, and HypersonicSQL. The returned identifier is of type long, short, or int.

Generator name: sequence
JPA GenerationType: SEQUENCE
Options: sequence, parameters
Description: This generator creates a sequence in DB2, PostgreSQL, Oracle, SAP DB, or Mckoi; or a generator in InterBase is used. The returned identifier is of type long, short, or int. Use the sequence option to define a catalog name for the sequence (hibernate_sequence is the default) and parameters if you need additional settings creating a sequence to be added to the DDL.

Generator name: increment
JPA GenerationType: (Not available)
Options: –
Description: At Hibernate startup, this generator reads the maximum (numeric) primary key column value of the table and increments the value by one each time a new row is inserted. The generated identifier is of type long, short, or int. This generator is especially efficient if the single-server Hibernate application has exclusive access to the database but should not be used in any other scenario.

Generator name: hilo
JPA GenerationType: (Not available)
Options: table, column, max_lo
Description: A high/low algorithm is an efficient way to generate identifiers of type long, given a table and column (by default hibernate_unique_key and next, respectively) as a source of high values. The high/low algorithm generates identifiers that are unique only for a particular database. High values are retrieved from a global source and are made unique by adding a local low value. This algorithm avoids congestion when a single source for identifier values has to be accessed for many inserts. See “Data Modeling 101” (Ambler, 2002) for more information about the high/low approach to unique identifiers. This generator needs to use a separate database connection from time to time to retrieve high values, so it isn’t supported with user-supplied database connections. In other words, don’t use it with sessionFactory.openSession(myConnection). The max_lo option defines how many low values are added until a new high value is fetched. Only settings greater than 1 are sensible; the default is 32767 (Short.MAX_VALUE).

Generator name: seqhilo
JPA GenerationType: (Not available)
Options: sequence, parameters, max_lo
Description: This generator works like the regular hilo generator, except it uses a named database sequence to generate high values.

Generator name: (JPA only)
JPA GenerationType: TABLE
Options: table, catalog, schema, pkColumnName, valueColumnName, pkColumnValue, allocationSize
Description: Much like Hibernate’s hilo strategy, TABLE relies on a database table that holds the last-generated integer primary key value, and each generator is mapped to one row in this table. Each row has two columns: pkColumnName and valueColumnName. The pkColumnValue assigns each row to a particular generator, and the value column holds the last retrieved primary key. The persistence provider allocates up to allocationSize integers in each turn.

Generator name: uuid.hex
JPA GenerationType: (Not available)
Options: separator
Description: This generator is a 128-bit UUID (an algorithm that generates identifiers of type string, unique within a network). The IP address is used in combination with a unique timestamp. The UUID is encoded as a string of hexadecimal digits of length 32, with an optional separator string between each component of the UUID representation. Use this generator strategy only if you need globally unique identifiers, such as when you have to merge two databases regularly.

Generator name: guid
JPA GenerationType: (Not available)
Options: –
Description: This generator provides a database-generated globally unique identifier string on MySQL and SQL Server.

Generator name: select
JPA GenerationType: (Not available)
Options: key
Description: This generator retrieves a primary key assigned by a database trigger by selecting the row by some unique key and retrieving the primary key value. An additional unique candidate key column is required for this strategy, and the key option has to be set to the name of the unique key column.
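The high/low idea described in table 4.1 can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not Hibernate’s actual hilo implementation: HighValueSource is a hypothetical in-memory stand-in for the database table that supplies high values, and the starting values are chosen for simplicity.

```java
import java.util.concurrent.atomic.AtomicLong;

public class HiLoSketch {

    // Hypothetical stand-in for the shared database table of high values.
    // Each call to nextHigh() represents one round trip to the database.
    public static class HighValueSource {
        private final AtomicLong next = new AtomicLong(0);
        private final AtomicLong roundTrips = new AtomicLong(0);

        public long nextHigh() {
            roundTrips.incrementAndGet();
            return next.getAndIncrement();
        }

        public long getRoundTrips() { return roundTrips.get(); }
    }

    // Simplified generator: one round trip per (maxLo + 1) identifiers.
    public static class HiLoGenerator {
        private final HighValueSource source;
        private final int maxLo;
        private long hi;
        private int lo;

        public HiLoGenerator(HighValueSource source, int maxLo) {
            this.source = source;
            this.maxLo = maxLo;
            this.lo = maxLo + 1; // forces a fetch of a high value on first use
        }

        public synchronized long generate() {
            if (lo > maxLo) {
                // Reserve a whole block of identifiers with a single fetch.
                hi = source.nextHigh() * (maxLo + 1L);
                lo = 0;
            }
            return hi + lo++;
        }
    }

    public static void main(String[] args) {
        HighValueSource table = new HighValueSource();
        HiLoGenerator first = new HiLoGenerator(table, 4);
        HiLoGenerator second = new HiLoGenerator(table, 4);
        System.out.println(first.generate());  // 0
        System.out.println(second.generate()); // 5
        System.out.println(first.generate());  // 1
    }
}
```

Two generators sharing one source never hand out the same value, because each owns the block of low values belonging to the high value it fetched. A larger max_lo means fewer fetches, at the cost of larger gaps when a generator is discarded before its block is used up.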
Some of the built-in identifier generators can be configured with options. In a native Hibernate XML mapping, you define options as pairs of keys and values:
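<generator class="sequence">
    <param name="sequence">MY_SEQUENCE</param>
    <param name="parameters">INCREMENT BY 1 START WITH 1</param>
</generator>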
You can use Hibernate identifier generators with annotations, even if no direct annotation is available:
@Entity
@org.hibernate.annotations.GenericGenerator(
    name = "hibernate-uuid",
    strategy = "uuid"
)
class MyEntity {

    @Id
    @GeneratedValue(generator = "hibernate-uuid")
    @Column(name = "MY_ID")
    String id;
}
The @GenericGenerator Hibernate extension can be used to give a Hibernate identifier generator a name, in this case hibernate-uuid. This name is then referenced by the standardized generator attribute. This declaration of a generator and its assignment by name also must be applied for sequence- or table-based identifier generation with annotations. Imagine that you want to use a customized sequence generator in all your entity classes. Because this identifier generator has to be global, it’s declared in orm.xml:
This declares that a database sequence named MY_SEQUENCE with an initial value of 123 can be used as a source for database identifier generation, and that the persistence engine should obtain 20 values every time it needs identifiers. (Note, though, that Hibernate Annotations, at the time of writing, ignores the initialValue setting.) To apply this identifier generator for a particular entity, use its name:
@Entity
class MyEntity {

    @Id @GeneratedValue(generator = "mySequenceGenerator")
    String id;
}
If you declared another generator with the same name at the entity level, before the class keyword, it would override the global identifier generator. The same approach can be used to declare and apply a @TableGenerator. You aren’t limited to the built-in strategies; you can create your own identifier generator by implementing Hibernate’s IdentifierGenerator interface. As always, it’s a good strategy to look at the Hibernate source code of the existing identifier generators for inspiration. It’s even possible to mix identifier generators for persistent classes in a single domain model, but for nonlegacy data we recommend using the same identifier generation strategy for all entities. For legacy data and application-assigned identifiers, the picture is more complicated. In this case, we’re often stuck with natural keys and especially composite keys. A composite key is a natural key that is composed of multiple table columns. Because composite identifiers can be a bit more difficult to work with and often only appear on legacy schemas, we only discuss them in the context of chapter 8, section 8.1, “Integrating legacy databases.” We assume from now on that you’ve added identifier properties to the entity classes of your domain model, and that after you completed the basic mapping of each entity and its identifier property, you continued to map value-typed properties of the entities. However, some special options can simplify or enhance your class mappings.
4.3
Class mapping options
If you check the <hibernate-mapping> and <class> elements in the DTD (or the reference documentation), you’ll find a few options we haven’t discussed so far:
■ Dynamic generation of CRUD SQL statements
■ Entity mutability control
■ Naming of entities for querying
■ Mapping package names
■ Quoting keywords and reserved database identifiers
■ Implementing database naming conventions
4.3.1
Dynamic SQL generation
By default, Hibernate creates SQL statements for each persistent class on startup. These statements are simple create, read, update, and delete operations for reading a single row, deleting a row, and so on. How can Hibernate create an UPDATE statement on startup? After all, the columns to be updated aren’t known at this time. The answer is that the generated SQL statement updates all columns, and if the value of a particular column isn’t modified, the statement sets it to its old value.
In some situations, such as a legacy table with hundreds of columns where the SQL statements will be large for even the simplest operations (say, only one column needs updating), you have to turn off this startup SQL generation and switch to dynamic statements generated at runtime. An extremely large number of entities can also impact startup time, because Hibernate has to generate all SQL statements for CRUD upfront. Memory consumption for this query statement cache will also be high if a dozen statements must be cached for thousands of entities (this isn’t an issue, usually). Two attributes for disabling CRUD SQL generation on startup are available on the <class> mapping element:
...
The dynamic-insert attribute tells Hibernate whether to include null property values in an SQL INSERT, and the dynamic-update attribute tells Hibernate whether to include unmodified properties in the SQL UPDATE. If you’re using JDK 5.0 annotation mappings, you need a native Hibernate annotation to enable dynamic SQL generation:
@Entity
@org.hibernate.annotations.Entity(
    dynamicInsert = true, dynamicUpdate = true
)
public class Item { ...
The second @Entity annotation from the Hibernate package extends the JPA annotation with additional options, including dynamicInsert and dynamicUpdate. Sometimes you can avoid generating any UPDATE statement, if the persistent class is mapped immutable.
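To make the difference between startup-generated and dynamic statements concrete, the following sketch builds the SQL strings by hand. It is not how Hibernate generates SQL internally; the class, method names, and column names are hypothetical, and details such as identifier columns in the INSERT are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class CrudSqlSketch {

    // Startup-time form: lists every column, whether or not its value changed.
    public static String staticUpdate(String table, List<String> columns, String idColumn) {
        List<String> assignments = new ArrayList<>();
        for (String column : columns) {
            assignments.add(column + " = ?");
        }
        return "update " + table + " set " + String.join(", ", assignments)
                + " where " + idColumn + " = ?";
    }

    // Runtime form (dynamic update): lists only columns whose value differs.
    // Returns null when nothing changed and no UPDATE is needed.
    public static String dynamicUpdate(String table, Map<String, Object> loaded,
                                       Map<String, Object> current, String idColumn) {
        List<String> assignments = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(entry.getValue(), loaded.get(entry.getKey()))) {
                assignments.add(entry.getKey() + " = ?");
            }
        }
        if (assignments.isEmpty()) {
            return null;
        }
        return "update " + table + " set " + String.join(", ", assignments)
                + " where " + idColumn + " = ?";
    }

    // Runtime form (dynamic insert): leaves out columns whose value is null.
    public static String dynamicInsert(String table, Map<String, Object> state) {
        List<String> columns = new ArrayList<>();
        List<String> markers = new ArrayList<>();
        for (Map.Entry<String, Object> entry : state.entrySet()) {
            if (entry.getValue() != null) {
                columns.add(entry.getKey());
                markers.add("?");
            }
        }
        return "insert into " + table + " (" + String.join(", ", columns)
                + ") values (" + String.join(", ", markers) + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("NAME", "Laptop");
        loaded.put("DESCRIPTION", "Used");
        loaded.put("INITIAL_PRICE", null);

        Map<String, Object> current = new LinkedHashMap<>(loaded);
        current.put("DESCRIPTION", "Used, good condition");

        System.out.println(staticUpdate("ITEM", new ArrayList<>(loaded.keySet()), "ITEM_ID"));
        System.out.println(dynamicUpdate("ITEM", loaded, current, "ITEM_ID"));
        System.out.println(dynamicInsert("ITEM", current));
    }
}
```

The static statement can be prepared once and reused for every update of the entity, which is why it is the default; the dynamic forms trade that reuse for shorter statements.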
4.3.2
Making an entity immutable
Instances of a particular class may be immutable. For example, in CaveatEmptor, a Bid made for an item is immutable. Hence, no UPDATE statement ever needs to be executed on the BID table. Hibernate can also make a few other optimizations, such as avoiding dirty checking, if you map an immutable class with the mutable attribute set to false:
...
A POJO is immutable if no public setter methods for any properties of the class are exposed—all values are set in the constructor. Instead of private setter methods, you often prefer direct field access by Hibernate for immutable persistent classes, so you don’t have to write useless accessor methods. You can map an immutable entity using annotations:
@Entity
@org.hibernate.annotations.Entity(mutable = false)
@org.hibernate.annotations.AccessType("field")
public class Bid { ...
Again, the native Hibernate @Entity annotation extends the JPA annotation with additional options. We have also shown the Hibernate extension annotation @AccessType here—this is an annotation you’ll rarely use. As explained earlier, the default access strategy for a particular entity class is implicit from the position of the mandatory @Id property. However, you can use @AccessType to force a more fine-grained strategy; it can be placed on class declarations (as in the preceding example) or even on particular fields or accessor methods. Let’s have a quick look at another issue, the naming of entities for queries.
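Returning to the immutable POJO described in this section, the shape of such a class might look like the sketch below. It is a hypothetical simplification of Bid with only two properties and no mapping metadata; the non-public no-argument constructor is included on the assumption that the persistence layer instantiates the class reflectively and then populates the fields directly.

```java
import java.math.BigDecimal;
import java.util.Date;

public class ImmutableBidSketch {

    // Hypothetical, simplified Bid: state is set once and never changed.
    public static class Bid {
        private BigDecimal amount;
        private Date created;

        // For the persistence layer only; application code uses the
        // public constructor below.
        Bid() {}

        public Bid(BigDecimal amount, Date created) {
            this.amount = amount;
            this.created = new Date(created.getTime()); // defensive copy, Date is mutable
        }

        public BigDecimal getAmount() { return amount; }

        // Returns a copy so callers can't modify the internal Date.
        public Date getCreated() { return new Date(created.getTime()); }

        // No setter methods: nothing to call, nothing to dirty-check.
    }

    public static void main(String[] args) {
        Bid bid = new Bid(new BigDecimal("12.50"), new Date(0L));
        bid.getCreated().setTime(99L);                  // changes only the returned copy
        System.out.println(bid.getAmount());            // 12.50
        System.out.println(bid.getCreated().getTime()); // 0
    }
}
```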
4.3.3
Naming entities for querying
By default, all class names are automatically “imported” into the namespace of the Hibernate query language, HQL. In other words, you can use the short class names without a package prefix in HQL, which is convenient. However, this auto-import can be turned off if two classes with the same name exist for a given SessionFactory, maybe in different packages of the domain model. If such a conflict exists, and you don’t change the default settings, Hibernate won’t know which class you’re referring to in HQL. You can turn off auto-import
of names into the HQL namespace for particular mapping files with the auto-import="false" setting on the <hibernate-mapping> root element. Entity names can also be imported explicitly into the HQL namespace. You can even import classes and interfaces that aren’t explicitly mapped, so a short name can be used in polymorphic HQL queries:
You can now use an HQL query such as from IAuditable to retrieve all persistent instances of classes that implement the auction.model.Auditable interface. (Don’t worry if you don’t know whether this feature is relevant to you at this point; we’ll get back to queries later in the book.) Note that the <import> element, like all other immediate child elements of <hibernate-mapping>, is an application-wide declaration, so you don’t have to (and can’t) duplicate this in other mapping files. With annotations, you can give an entity an explicit name, if the short name would result in a collision in the JPA QL or HQL namespace:
@Entity(name="AuctionItem")
public class Item { ... }
Now let’s consider another aspect of naming: the declaration of packages.
4.3.4
Declaring a package name
All the persistent classes of the CaveatEmptor application are declared in the Java package auction.model. However, you don’t want to repeat the full package name whenever this or any other class is named in an association, subclass, or component mapping. Instead, specify a package attribute:
...
Now all unqualified class names that appear in this mapping document will be prefixed with the declared package name. We assume this setting in all mapping examples in this book and use unqualified names for CaveatEmptor model classes. Names of classes and tables must be selected carefully. However, a name you’ve chosen may be reserved by the SQL database system, so the name has to be quoted.
4.3.5
Quoting SQL identifiers
By default, Hibernate doesn’t quote table and column names in the generated SQL. This makes the SQL slightly more readable, and it also allows you to take advantage of the fact that most SQL databases are case insensitive when comparing unquoted identifiers. From time to time, especially in legacy databases, you encounter identifiers with strange characters or whitespace, or you wish to force case sensitivity. Or, if you rely on Hibernate’s defaults, a class or property name in Java may be automatically translated to a table or column name that isn’t allowed in your database management system. For example, the User class is mapped to a USER table, which is usually a reserved keyword in SQL databases. Hibernate doesn’t know the SQL keywords of any DBMS product, so the database system throws an exception at startup or runtime. If you quote a table or column name with backticks in the mapping document, Hibernate always quotes this identifier in the generated SQL. The following property declaration forces Hibernate to generate SQL with the quoted column name "DESCRIPTION". Hibernate also knows that Microsoft SQL Server needs the variation [DESCRIPTION] and that MySQL requires `DESCRIPTION`.
There is no way, apart from quoting all table and column names in backticks, to force Hibernate to use quoted identifiers everywhere. You should consider renaming tables or columns with reserved keyword names whenever possible. Quoting with backticks works with annotation mappings, but it’s an implementation detail of Hibernate and not part of the JPA specification.
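The translation from a backtick-quoted name in the mapping to a product-specific quoted identifier can be pictured as follows. This is a hypothetical helper written for illustration, not Hibernate’s dialect API; it covers only the three quoting styles mentioned in this section.

```java
public class QuotingSketch {

    // Hypothetical dialect names; only the quoting style matters here.
    public enum Dialect { STANDARD, SQL_SERVER, MYSQL }

    // Returns the identifier as it would be written in the generated SQL.
    // Names not wrapped in backticks in the mapping are passed through unquoted.
    public static String render(String mappedName, Dialect dialect) {
        boolean quoted = mappedName.length() > 1
                && mappedName.startsWith("`")
                && mappedName.endsWith("`");
        if (!quoted) {
            return mappedName;
        }
        String bare = mappedName.substring(1, mappedName.length() - 1);
        switch (dialect) {
            case SQL_SERVER:
                return "[" + bare + "]";
            case MYSQL:
                return "`" + bare + "`";
            default:
                return "\"" + bare + "\"";
        }
    }

    public static void main(String[] args) {
        System.out.println(render("`DESCRIPTION`", Dialect.STANDARD));   // "DESCRIPTION"
        System.out.println(render("`DESCRIPTION`", Dialect.SQL_SERVER)); // [DESCRIPTION]
        System.out.println(render("`DESCRIPTION`", Dialect.MYSQL));      // `DESCRIPTION`
        System.out.println(render("DESCRIPTION", Dialect.MYSQL));        // DESCRIPTION
    }
}
```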
4.3.6
Implementing naming conventions
We often encounter organizations with strict conventions for database table and column names. Hibernate provides a feature that allows you to enforce naming standards automatically. Suppose that all table names in CaveatEmptor should follow the pattern CE_<table name>. One solution is to manually specify a table attribute on all <class> and collection elements in the mapping files. However, this approach is time-consuming and easily forgotten. Instead, you can implement Hibernate’s NamingStrategy interface, as in listing 4.1.
Listing 4.1 NamingStrategy implementation
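// Package locations as of Hibernate 3.x
import org.hibernate.cfg.ImprovedNamingStrategy;
import org.hibernate.util.StringHelper;

public class CENamingStrategy extends ImprovedNamingStrategy {

    // Called when a mapping declares no explicit table name
    public String classToTableName(String className) {
        return StringHelper.unqualify(className);
    }

    // Called when a property declares no explicit column name
    public String propertyToColumnName(String propertyName) {
        return propertyName;
    }

    // Called when an explicit table name is declared
    public String tableName(String tableName) {
        return "CE_" + tableName;
    }

    // Called when an explicit column name is declared
    public String columnName(String columnName) {
        return columnName;
    }

    public String propertyToTableName(String className,
                                      String propertyName) {
        return "CE_"
                + classToTableName(className)
                + '_'
                + propertyToColumnName(propertyName);
    }
}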
You extend the ImprovedNamingStrategy, which provides default implementations for all methods of NamingStrategy you don’t want to implement from scratch (look at the API documentation and source). The classToTableName() method is called only if a mapping doesn’t specify an explicit table name. The propertyToColumnName() method is called if a property has no explicit column name. The tableName() and columnName() methods are called when an explicit name is declared. If you enable this CENamingStrategy, the class mapping declaration
results in CE_BANKACCOUNT as the name of the table. However, if a table name is specified, like this,
then CE_BANK_ACCOUNT is the name of the table. In this case, BANK_ACCOUNT is passed to the tableName() method.
The best feature of the NamingStrategy interface is the potential for dynamic behavior. To activate a specific naming strategy, you can pass an instance to the Hibernate Configuration at startup:
Configuration cfg = new Configuration();
cfg.setNamingStrategy( new CENamingStrategy() );
SessionFactory sf = cfg.configure().buildSessionFactory();
This allows you to have multiple SessionFactory instances based on the same mapping documents, each using a different NamingStrategy. This is extremely useful in a multiclient installation, where unique table names (but the same data model) are required for each client. However, a better way to handle this kind of requirement is to use an SQL schema (a kind of namespace), as already discussed in chapter 3, section 3.3.4, “Handling global metadata.” You can set a naming strategy implementation in Java Persistence in your persistence.xml file with the hibernate.ejb.naming_strategy option. Now that we have covered the concepts and most important mappings for entities, let’s map value types.
4.4
Fine-grained models and mappings
After spending the first half of this chapter almost exclusively on entities and the respective basic persistent class-mapping options, we’ll now focus on value types in their various forms. Two different kinds come to mind immediately: value-typed classes that came with the JDK, such as String or primitives, and value-typed classes defined by the application developer, such as Address and MonetaryAmount. First, you map persistent class properties that use JDK types and learn the basic mapping elements and attributes. Then you attack custom value-typed classes and map them as embeddable components.
4.4.1
Mapping basic properties
If you map a persistent class, no matter whether it’s an entity or a value type, all persistent properties have to be mapped explicitly in the XML mapping file. On the other hand, if a class is mapped with annotations, all of its properties are considered persistent by default. You can mark properties with the @javax.persistence.Transient annotation to exclude them, or use the transient Java keyword (which usually only excludes fields for Java serialization). In a JPA XML descriptor, you can exclude a particular field or property:
...
A typical Hibernate property mapping defines a POJO’s property name, a database column name, and the name of a Hibernate type, and it’s often possible to omit the type. So, if description is a property of (Java) type java.lang.String, Hibernate uses the Hibernate type string by default (we come back to the Hibernate type system in the next chapter). Hibernate uses reflection to determine the Java type of the property. Thus, the following mappings are equivalent:
It’s even possible to omit the column name if it’s the same as the property name, ignoring case. (This is one of the sensible defaults we mentioned earlier.) For some more unusual cases, which you’ll see more about later, you may need to use a <column> element instead of the column attribute in your XML mapping. The <column> element provides more flexibility: It has more optional attributes and may appear more than once. (A single property can map to more than one column, a technique we discuss in the next chapter.) The following two property mappings are equivalent:
The <property> element (and especially the <column> element) also defines certain attributes that apply mainly to automatic database schema generation. If you aren’t using the hbm2ddl tool (see chapter 2, section 2.1.4, “Running and testing the application”) to generate the database schema, you may safely omit these. However, it’s preferable to include at least the not-null attribute, because Hibernate can then report illegal null property values without going to the database:
JPA is based on a configuration by exception model, so you could rely on defaults.
If a property of a persistent class isn’t annotated, the following rules apply:
■ If the property is of a JDK type, it’s automatically persistent. In other words, it’s handled like in a Hibernate XML mapping file.
■ Otherwise, if the class of the property is annotated as @Embeddable, it’s mapped as a component of the owning class. We’ll discuss embedding of components later in this chapter.
■ Otherwise, if the type of the property is Serializable, its value is stored in its serialized form. This usually isn’t what you want, and you should always map Java classes instead of storing a heap of bytes in the database. Imagine maintaining a database with this binary information when the application is gone in a few years.
If you don’t want to rely on these defaults, apply the @Basic annotation on a particular property. The @Column annotation is the equivalent of the XML <column> element. Here is an example of how you declare a property’s value as required:
@Basic(optional = false)
@Column(nullable = false)
public BigDecimal getInitialPrice() {
    return initialPrice;
}
The @Basic annotation marks the property as not optional on the Java object level. The second setting, nullable = false on the column mapping, is only responsible for the generation of a NOT NULL database constraint. The Hibernate JPA implementation treats both options the same way in any case, so you may as well use only one of the annotations for this purpose. In a JPA XML descriptor, this mapping looks the same:
...
Quite a few options in Hibernate metadata are available to declare schema constraints, such as NOT NULL on a column. Except for simple nullability, however, they’re only used to produce DDL when Hibernate exports a database schema from mapping metadata. We’ll discuss customization of SQL, including DDL, in chapter 8, section 8.3, “Improving schema DDL.” On the other hand, the Hibernate Annotations package includes a more advanced and sophisticated data validation framework, which you can use not only to define database schema
constraints in DDL, but also for data validation at runtime. We’ll discuss it in chapter 17.
Are annotations for properties always on the accessor methods?
Customizing property access
Properties of a class are accessed by the persistence engine either directly (through fields) or indirectly (through getter and setter property accessor methods). In XML mapping files, you control the default access strategy for a class with the default-access="field|property|noop|custom.Class" attribute of the hibernate-mapping root element. An annotated entity inherits the default from the position of the mandatory @Id annotation. For example, if @Id has been declared on a field, not a getter method, all other property mapping annotations, like the name of the column for the item’s description property, are also declared on fields:
@Column(name = "ITEM_DESCR")
private String description;

public String getDescription() {
    return description;
}
This is the default behavior as defined by the JPA specification. However, Hibernate allows flexible customization of the access strategy with the @org.hibernate.annotations.AccessType() annotation:
■ If AccessType is set on the class/entity level, all attributes of the class are accessed according to the selected strategy. Attribute-level annotations are expected on either fields or getter methods, depending on the strategy. This setting overrides any defaults from the position of the standard @Id annotations.
■ If an entity defaults or is explicitly set for field access, the AccessType("property") annotation on a field switches this particular attribute to runtime access through property getter/setter methods. The position of the AccessType annotation is still the field.
■ If an entity defaults or is explicitly set for property access, the AccessType("field") annotation on a getter method switches this particular attribute to runtime access through a field of the same name. The position of the AccessType annotation is still the getter method.
■ Any @Embedded class inherits the default or explicitly declared access strategy of the owning root entity class.
■ Any @MappedSuperclass properties are accessed with the default or explicitly declared access strategy of the mapped entity class.
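What the two access strategies in the rules above mean at runtime can be shown with plain Java reflection. This is an illustrative sketch, not Hibernate’s PropertyAccessor machinery; the Item class and its uppercasing getter are hypothetical and exist only to make the two paths return visibly different values.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Locale;

public class AccessStrategySketch {

    // Hypothetical class whose getter does more than return the field.
    public static class Item {
        private String description = "raw";

        public String getDescription() {
            return description.toUpperCase(Locale.ROOT);
        }
    }

    // "field" strategy: read the private field directly, bypassing the getter.
    public static Object readByField(Object target, String property) {
        try {
            Field field = target.getClass().getDeclaredField(property);
            field.setAccessible(true);
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No accessible field: " + property, e);
        }
    }

    // "property" strategy: call the JavaBeans-style getter method.
    public static Object readByGetter(Object target, String property) {
        String getter = "get"
                + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        try {
            Method method = target.getClass().getMethod(getter);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No accessible getter: " + getter, e);
        }
    }

    public static void main(String[] args) {
        Item item = new Item();
        System.out.println(readByField(item, "description"));  // raw
        System.out.println(readByGetter(item, "description")); // RAW
    }
}
```

The choice matters whenever accessor methods contain logic: with field access the stored value is exactly the field’s content, whereas with property access any transformation in the getter ends up in the database.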
You can also control access strategies on the property level in Hibernate XML mappings with the access attribute:
Or, you can set the access strategy for all class mappings inside a <hibernate-mapping> root element with the default-access attribute.
Another strategy besides field and property access that can be useful is noop. It maps a property that doesn’t exist in the Java persistent class. This sounds strange, but it lets you refer to this “virtual” property in HQL queries (in other words, to use the database column in HQL queries only). If none of the built-in access strategies are appropriate, you can define your own customized property-access strategy by implementing the interface org.hibernate.property.PropertyAccessor. Set the (fully qualified) class name on the access mapping attribute or @AccessType annotation. Have a look at the Hibernate source code for inspiration; it’s a straightforward exercise.
Some properties don’t map to a column at all. In particular, a derived property takes its value from an SQL expression.
Using derived properties
The value of a derived property is calculated at runtime by evaluating an expression that you define using the formula attribute. For example, you may map a totalIncludingTax property to an SQL expression:
The given SQL formula is evaluated every time the entity is retrieved from the database (and not at any other time, so the result may be outdated if other properties are modified). The property doesn’t have a column attribute (or subelement) and never appears in an SQL INSERT or UPDATE, only in SELECTs. Formulas may refer to columns of the database table, they can call SQL functions, and they may even include SQL subselects. The SQL expression is passed to the underlying database as is; this is a good chance to bind your mapping file to a particular database product, if you aren’t careful and rely on vendor-specific operators or keywords. Formulas are also available with a Hibernate annotation:
@org.hibernate.annotations.Formula("TOTAL + TAX_RATE * TOTAL")
public BigDecimal getTotalIncludingTax() {
    return totalIncludingTax;
}
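The behavior of such a derived property (evaluated when the entity is loaded, read-only afterwards, and possibly outdated) can be imitated in plain Java. In this sketch the Order class is hypothetical and its constructor stands in for loading a row; the arithmetic mirrors the TOTAL + TAX_RATE * TOTAL formula.

```java
import java.math.BigDecimal;

public class DerivedPropertySketch {

    // Hypothetical entity; the constructor simulates retrieval from the database.
    public static class Order {
        private BigDecimal total;
        private final BigDecimal taxRate;
        private final BigDecimal totalIncludingTax; // derived, never written back

        public Order(BigDecimal total, BigDecimal taxRate) {
            this.total = total;
            this.taxRate = taxRate;
            // Evaluated once, at "load" time: TOTAL + TAX_RATE * TOTAL
            this.totalIncludingTax = total.add(taxRate.multiply(total));
        }

        public BigDecimal getTotal() { return total; }

        public void setTotal(BigDecimal total) { this.total = total; }

        public BigDecimal getTaxRate() { return taxRate; }

        public BigDecimal getTotalIncludingTax() { return totalIncludingTax; }
    }

    public static void main(String[] args) {
        Order order = new Order(new BigDecimal("100.00"), new BigDecimal("0.20"));
        System.out.println(order.getTotalIncludingTax()); // 120.0000

        order.setTotal(new BigDecimal("200.00"));
        // Still the value computed at load time; it is now outdated.
        System.out.println(order.getTotalIncludingTax()); // 120.0000
    }
}
```

To see a current value after changing the underlying properties, the entity would have to be written and read again so that the database re-evaluates the expression.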
The following example uses a correlated subselect to calculate the average amount of all bids for an item:
Notice that unqualified column names refer to columns of the table of the class to which the derived property belongs.
Another special kind of property relies on database-generated values.
Generated and default property values
Imagine a particular property of a class has its value generated by the database, usually when the entity row is inserted for the first time. Typical database-generated values are timestamp of creation, a default price for an item, and a trigger that runs for every modification. Typically, Hibernate applications need to refresh objects that contain any properties for which the database generates values. Marking properties as generated, however, lets the application delegate this responsibility to Hibernate. Essentially, whenever Hibernate issues an SQL INSERT or UPDATE for an entity that has defined generated properties, it immediately does a SELECT afterwards to retrieve the generated values. Use the generated switch on a property mapping to enable this automatic refresh:
Properties marked as database-generated must additionally be noninsertable and nonupdateable, which you control with the insert and update attributes. If both are set to false, the property’s columns never appear in the INSERT or UPDATE statements—the property value is read-only. Also, you usually don’t add a public setter method in your class for an immutable property (and switch to field access). With annotations, declare immutability (and automatic refresh) with the @Generated Hibernate annotation:
@Column(updatable = false, insertable = false)
@org.hibernate.annotations.Generated(
    org.hibernate.annotations.GenerationTime.ALWAYS
)
private Date lastModified;
The settings available are GenerationTime.ALWAYS and GenerationTime.INSERT, and the equivalent options in XML mappings are generated="always" and generated="insert". A special case of database-generated property values are default values. For example, you may want to implement a rule that every auction item costs at least $1. First, you’d add this to your database catalog as the default value for the INITIAL_PRICE column:
create table ITEM ( ... INITIAL_PRICE number(10,2) default '1', ... );
If you use Hibernate’s schema export tool, hbm2ddl, you can enable this output by adding a default attribute to the property mapping:
... ...
Note that you also have to enable dynamic insertion and update statement generation, so that the column with the default value isn’t included in every statement if its value is null (otherwise a NULL would be inserted instead of the default value). Furthermore, an instance of Item that has been made persistent but not yet flushed to the database and not refreshed again won’t have the default value set on the object property. In other words, you need to execute an explicit flush:
Item newItem = new Item(...);
session.save(newItem);
newItem.getInitialPrice(); // is null

session.flush(); // Trigger an INSERT
// Hibernate does a SELECT automatically
newItem.getInitialPrice(); // is $1
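The earlier note about dynamic insertion can be pictured with a small plain-Java sketch. It is not Hibernate code: DynamicInsertDemo and buildInsert() are hypothetical names, and the ITEM_ID and NAME columns are made up for the example. It only shows the idea of leaving null-valued columns out of the statement so that the database default applies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class DynamicInsertDemo {
    // Builds an INSERT that names only the columns whose value is non-null,
    // so the database applies its own default to every omitted column.
    static String buildInsert(String table, Map<String, Object> columnValues) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner parameters = new StringJoiner(", ");
        for (Map.Entry<String, Object> entry : columnValues.entrySet()) {
            if (entry.getValue() != null) {
                columns.add(entry.getKey());
                parameters.add("?");
            }
        }
        return "insert into " + table + " (" + columns + ") values (" + parameters + ")";
    }

    // Sample row: ITEM_ID and NAME are made-up columns, INITIAL_PRICE is unset.
    static String sampleInsert() {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("ITEM_ID", 1L);
        item.put("NAME", "Sample item");
        item.put("INITIAL_PRICE", null);
        return buildInsert("ITEM", item);
    }

    public static void main(String[] args) {
        System.out.println(sampleInsert());
    }
}
```

Because INITIAL_PRICE is missing from the generated statement, the database is free to fill in its default instead of storing NULL.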
Because you set generated="insert", Hibernate knows that an immediate additional SELECT is required to read the database-generated property value. You can map default column values with annotations as part of the DDL definition for a column:
@Column(name = "INITIAL_PRICE",
        columnDefinition = "number(10,2) default '1'")
@org.hibernate.annotations.Generated(
    org.hibernate.annotations.GenerationTime.INSERT
)
private BigDecimal initialPrice;
The columnDefinition attribute includes the complete properties for the column DDL, with datatype and all constraints. Keep in mind that an actual nonportable SQL datatype may bind your annotation mapping to a particular database management system. We’ll come back to the topic of constraints and DDL customization in chapter 8, section 8.3, “Improving schema DDL.” Next, you’ll map user-defined value-typed classes. You can easily spot them in your UML class diagrams if you search for a composition relationship between two classes. One of them is a dependent class, a component.
4.4.2
Mapping components
So far, the classes of the object model have all been entity classes, each with its own lifecycle and identity. The User class, however, has a special kind of association with the Address class, as shown in figure 4.2. In object-modeling terms, this association is a kind of aggregation—a part-of relationship. Aggregation is a strong form of association; it has some additional semantics with regard to the lifecycle of objects. In this case, you have an even stronger form, composition, where the lifecycle of the part is fully dependent upon the lifecycle of the whole. Object modeling experts and UML designers claim that there is no difference between this composition and other weaker styles of association when it comes to the actual Java implementation. But in the context of ORM, there is a big difference: A composed class is often a candidate value type.
Figure 4.2 Relationships between User and Address using composition. The User class (firstname, lastname, username, password, email, ranking, admin) is composed of two Address instances in the roles home and billing; Address declares street, zipcode, and city.
You map Address as a value type and User as an entity. Does this affect the implementation of the POJO classes? Java has no concept of composition—a class or attribute can’t be marked as a component or composition. The only difference is the object identifier: A component has no individual identity, hence the persistent component class requires no identifier property or identifier mapping. It’s a simple POJO:
public class Address {
    private String street;
    private String zipcode;
    private String city;

    public Address() {}

    public String getStreet() { return street; }
    public void setStreet(String street) { this.street = street; }

    public String getZipcode() { return zipcode; }
    public void setZipcode(String zipcode) { this.zipcode = zipcode; }

    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
}
The composition between User and Address is a metadata-level notion; you only have to tell Hibernate that the Address is a value type in the mapping document or with annotations.
Component mapping in XML
Hibernate uses the term component for a user-defined class that is persisted to the same table as the owning entity, an example of which is shown in listing 4.2. (The use of the word component here has nothing to do with the architecture-level concept, as in software component.)
Listing 4.2 Mapping of the User class with a component Address
...
You declare the persistent attributes of Address inside the <component> element. The property of the User class is named homeAddress. You reuse the same component class to map another property of this type to the same table.
Figure 4.3 shows how the attributes of the Address class are persisted to the same table as the User entity. Notice that, in this example, you model the composition association as unidirectional. You can’t navigate from Address to User. Hibernate supports both unidirectional and bidirectional compositions, but unidirectional composition is far more common. An example of a bidirectional mapping is shown in listing 4.3.
Listing 4.3 Adding a back-pointer to a composition
Figure 4.3 Table attributes of User with Address component
In listing 4.3, the <parent> element maps a property of type User to the owning entity, which in this example is the property named user. You can then call Address.getUser() to navigate in the other direction. This is really a simple back-pointer.
A Hibernate component can own other components and even associations to other entities. This flexibility is the foundation of Hibernate’s support for fine-grained object models. For example, you can create a Location class with detailed information about the home address of an Address owner:
The design of the Location class is equivalent to the Address class. You now have three classes: one entity and two value types, all mapped to the same table. Now let’s map components with JPA annotations.
Annotating embedded classes
The Java Persistence specification calls components embedded classes. To map an embedded class with annotations, you can declare a particular property in the owning entity class as @Embedded, in this case the homeAddress of User:
@Entity
@Table(name = "USERS")
public class User {
    ...
    @Embedded
    private Address homeAddress;
    ...
}
If you don’t declare a property as @Embedded, and it isn’t of a JDK type, Hibernate looks into the associated class for the @Embeddable annotation. If it’s present, the property is automatically mapped as a dependent component.
This is what the embeddable class looks like:
@Embeddable
public class Address {

    @Column(name = "ADDRESS_STREET", nullable = false)
    private String street;

    @Column(name = "ADDRESS_ZIPCODE", nullable = false)
    private String zipcode;

    @Column(name = "ADDRESS_CITY", nullable = false)
    private String city;
    ...
}
You can further customize the individual property mappings in the embeddable class, such as with the @Column annotation. The USERS table now contains, among others, the columns ADDRESS_STREET, ADDRESS_ZIPCODE, and ADDRESS_CITY. Any other entity table that contains component fields (say, an Order class that also has an Address) uses the same column options. You can also add a back-pointer property to the Address embeddable class and map it with @org.hibernate.annotations.Parent. Sometimes you’ll want to override the settings you made inside the embeddable class from outside for a particular entity. For example, here is how you can rename the columns:
@Entity
@Table(name = "USERS")
public class User {
    ...
    @Embedded
    @AttributeOverrides( {
        @AttributeOverride(name = "street",
                           column = @Column(name="HOME_STREET") ),
        @AttributeOverride(name = "zipcode",
                           column = @Column(name="HOME_ZIPCODE") ),
        @AttributeOverride(name = "city",
                           column = @Column(name="HOME_CITY") )
    })
    private Address homeAddress;
    ...
}
The new @Column declarations in the User class override the settings of the embeddable class. Note that all attributes on the embedded @Column annotation are replaced, so they’re no longer nullable = false. In a JPA XML descriptor, a mapping of an embeddable class and a composition looks like the following:
...
There are two important limitations to classes mapped as components. First, shared references, as for all value types, aren’t possible. The component homeAddress doesn’t have its own database identity (primary key) and so can’t be referred to by any object other than the containing instance of User. Second, there is no elegant way to represent a null reference to an Address. In lieu of any elegant approach, Hibernate represents a null component as null values in all mapped columns of the component. This means that if you store a component object with all null property values, Hibernate returns a null component when the owning entity object is retrieved from the database. You’ll find many more component mappings (even collections of them) throughout the book.
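The second limitation can be sketched in plain Java. This is not Hibernate’s internal code; NullComponentDemo and hydrate() are hypothetical names, and the sample values are made up. The sketch only imitates the rule that a row whose component columns are all NULL comes back as a null component.

```java
class NullComponentDemo {
    // Minimal stand-in for the Address value type (no identifier property).
    static final class Address {
        final String street;
        final String zipcode;
        final String city;

        Address(String street, String zipcode, String city) {
            this.street = street;
            this.zipcode = zipcode;
            this.city = city;
        }
    }

    // Rebuilds the component from the three column values of one row.
    // If every mapped column is NULL, the component itself is reported as null.
    static Address hydrate(String street, String zipcode, String city) {
        if (street == null && zipcode == null && city == null) {
            return null;
        }
        return new Address(street, zipcode, city);
    }

    public static void main(String[] args) {
        System.out.println(hydrate(null, null, null) == null);          // all columns NULL
        System.out.println(hydrate("Main Street", null, null) != null); // one column set
    }
}
```

An Address object with only null properties and a missing Address therefore can’t be told apart after a round trip through the database.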
4.5
Summary
In this chapter, you learned the essential distinction between entities and value types and how these concepts influence the implementation of your domain model as persistent Java classes. Entities are the coarser-grained classes of your system. Their instances have an independent lifecycle and their own identity, and they can be referenced by many
other instances. Value types, on the other hand, are dependent on a particular entity class. An instance of a value type has a lifecycle bound by its owning entity instance, and it can be referenced by only one entity—it has no individual identity. We looked at Java identity, object equality, and database identity, and at what makes good primary keys. You learned which generators for primary key values are built into Hibernate, and how you can use and extend this identifier system. You also learned various (mostly optional) class mapping options and, finally, how basic properties and value-type components are mapped in XML mappings and annotations. For convenience, table 4.2 summarizes the differences between Hibernate and Java Persistence related to concepts discussed in this chapter.
Table 4.2 Hibernate and JPA comparison chart for chapter 4

Hibernate Core: Entity- and value-typed classes are the essential concepts for the support of rich and fine-grained domain models.
Java Persistence and EJB 3.0: The JPA specification makes the same distinction, but calls value types “embeddable classes.” However, nested embeddable classes are considered a nonportable feature.

Hibernate Core: Hibernate supports 10 identifier generation strategies out-of-the-box.
Java Persistence and EJB 3.0: JPA standardizes a subset of 4 identifier generators, but allows vendor extension.

Hibernate Core: Hibernate can access properties through fields, accessor methods, or with any custom PropertyAccessor implementation. Strategies can be mixed for a particular class.
Java Persistence and EJB 3.0: JPA standardizes property access through fields or access methods, and strategies can’t be mixed for a particular class without Hibernate extension annotations.

Hibernate Core: Hibernate supports formula properties and database-generated values.
Java Persistence and EJB 3.0: JPA doesn’t include these features, a Hibernate extension is needed.
In the next chapter, we’ll attack inheritance and how hierarchies of entity classes can be mapped with various strategies. We’ll also talk about the Hibernate mapping type system, the converters for value types we’ve shown in a few examples.
Inheritance and custom types
This chapter covers
■ Inheritance mapping strategies
■ The Hibernate mapping type system
■ Customization of mapping types
We deliberately haven’t talked much about inheritance mapping so far. Mapping a hierarchy of classes to tables can be a complex issue, and we’ll present various strategies in this chapter. You’ll learn which strategy to choose in a particular scenario. The Hibernate type system, with all its built-in converters and transformers for Java value-typed properties to SQL datatypes, is the second big topic we discuss in this chapter. Let’s start with the mapping of entity inheritance.
5.1
Mapping class inheritance
A simple strategy for mapping classes to database tables might be “one table for every entity persistent class.” This approach sounds simple enough and, indeed, works well until we encounter inheritance. Inheritance is such a visible structural mismatch between the object-oriented and relational worlds because object-oriented systems model both is a and has a relationships. SQL-based models provide only has a relationships between entities; SQL database management systems don’t support type inheritance—and even when it’s available, it’s usually proprietary or incomplete. There are four different approaches to representing an inheritance hierarchy:
■ Table per concrete class with implicit polymorphism: Use no explicit inheritance mapping, and default runtime polymorphic behavior.
■ Table per concrete class: Discard polymorphism and inheritance relationships completely from the SQL schema.
■ Table per class hierarchy: Enable polymorphism by denormalizing the SQL schema, and utilize a type discriminator column that holds type information.
■ Table per subclass: Represent is a (inheritance) relationships as has a (foreign key) relationships.
This section takes a top-down approach; it assumes that you’re starting with a domain model and trying to derive a new SQL schema. However, the mapping strategies described are just as relevant if you’re working bottom up, starting with existing database tables. We’ll show some tricks along the way that help you deal with imperfect table layouts.
5.1.1
Table per concrete class with implicit polymorphism
Suppose we stick with the simplest approach suggested. You can use exactly one table for each (nonabstract) class. All properties of a class, including inherited properties, can be mapped to columns of this table, as shown in figure 5.1.
Figure 5.1 Mapping all concrete classes to an independent table
You don’t have to do anything special in Hibernate to enable polymorphic behavior. The mapping for CreditCard and BankAccount is straightforward, each in its own entity <class> element, as we have done already for classes without a superclass (or persistent interfaces). Hibernate still knows about the superclass (or any interface) because it scans the persistent classes on startup.
The main problem with this approach is that it doesn’t support polymorphic associations very well. In the database, associations are usually represented as foreign key relationships. In figure 5.1, if the subclasses are all mapped to different tables, a polymorphic association to their superclass (abstract BillingDetails in this example) can’t be represented as a simple foreign key relationship. This would be problematic in our domain model, because BillingDetails is associated with User; both subclass tables would need a foreign key reference to the USERS table. Or, if User had a many-to-one relationship with BillingDetails, the USERS table would need a single foreign key column, which would have to refer to both concrete subclass tables. This isn’t possible with regular foreign key constraints.
Polymorphic queries (queries that return objects of all classes that match the interface of the queried class) are also problematic. A query against the superclass must be executed as several SQL SELECTs, one for each concrete subclass. For a query against the BillingDetails class Hibernate uses the following SQL:
select CREDIT_CARD_ID, OWNER, NUMBER, EXP_MONTH, EXP_YEAR ...
from CREDIT_CARD

select BANK_ACCOUNT_ID, OWNER, ACCOUNT, BANKNAME, ...
from BANK_ACCOUNT
Notice that a separate query is needed for each concrete subclass. On the other hand, queries against the concrete classes are trivial and perform well—only one of the statements is needed.
(Also note that here, and in other places in this book, we show SQL that is conceptually identical to the SQL executed by Hibernate. The actual SQL may look superficially different.) A further conceptual problem with this mapping strategy is that several different columns, of different tables, share exactly the same semantics. This makes schema evolution more complex. For example, a change to a superclass property results in changes to multiple columns. It also makes it much more difficult to implement database integrity constraints that apply to all subclasses. We recommend this approach (only) for the top level of your class hierarchy, where polymorphism isn’t usually required, and when modification of the superclass in the future is unlikely. Also, the Java Persistence interfaces don’t support full polymorphic queries; only mapped entities (@Entity) can be officially part of a Java Persistence query (note that the Hibernate query interfaces are polymorphic, even if you map with annotations). If you’re relying on this implicit polymorphism, you map concrete classes with @Entity, as usual. However, you also have to duplicate the properties of the superclass to map them to all concrete class tables. By default, properties of the superclass are ignored and not persistent! You need to annotate the superclass to enable embedding of its properties in the concrete subclass tables:
@MappedSuperclass
public abstract class BillingDetails {

    @Column(name = "OWNER", nullable = false)
    private String owner;
    ...
}
Now map the concrete subclasses:
@Entity
@AttributeOverride(name = "owner",
                   column = @Column(name = "CC_OWNER", nullable = false))
public class CreditCard extends BillingDetails {

    @Id @GeneratedValue
    @Column(name = "CREDIT_CARD_ID")
    private Long id = null;

    @Column(name = "NUMBER", nullable = false)
    private String number;
    ...
}
You can override column mappings from the superclass in a subclass with the @AttributeOverride annotation. You rename the OWNER column to CC_OWNER in the CREDIT_CARD table. The database identifier can also be declared in the superclass, with a shared column name and generator strategy for all subclasses. Let’s repeat the same mapping in a JPA XML descriptor:
... ... ...
NOTE
A component is a value type; hence, the normal entity inheritance rules presented in this chapter don’t apply. However, you can map a subclass as a component by including all the properties of the superclass (or interface) in your component mapping. With annotations, you use the @MappedSuperclass annotation on the superclass of the embeddable component you’re mapping just like you would for an entity. Note that this feature is available only in Hibernate Annotations and isn’t standardized or portable.
With the help of the SQL UNION operation, you can eliminate most of the issues with polymorphic queries and associations, which are present with this mapping strategy.
5.1.2
Table per concrete class with unions
First, let’s consider a union subclass mapping with BillingDetails as an abstract class (or interface), as in the previous section. In this situation, we again have two tables and duplicate superclass columns in both: CREDIT_CARD and BANK_ACCOUNT. What’s new is a special Hibernate mapping that includes the superclass, as you can see in listing 5.1.
Listing 5.1 Using the <union-subclass> inheritance strategy
...
■ An abstract superclass or an interface has to be declared as abstract="true"; otherwise a separate table for instances of the superclass is needed.
■ The database identifier mapping is shared for all concrete classes in the hierarchy. The CREDIT_CARD and the BANK_ACCOUNT tables both have a BILLING_DETAILS_ID primary key column. The database identifier property now has to be shared for all subclasses; hence you have to move it into BillingDetails and remove it from CreditCard and BankAccount.
■ Properties of the superclass (or interface) are declared here and inherited by all concrete class mappings. This avoids duplication of the same mapping.
■ A concrete subclass is mapped to a table; the table inherits the superclass (or interface) identifier and other property mappings.
The first advantage you may notice with this strategy is the shared declaration of superclass (or interface) properties. No longer do you have to duplicate these mappings for all concrete classes—Hibernate takes care of this. Keep in mind that the SQL schema still isn’t aware of the inheritance; effectively, we’ve mapped two unrelated tables to a more expressive class structure. Except for the different primary key column name, the tables look exactly alike, as shown in figure 5.1. In JPA annotations, this strategy is known as TABLE_PER_CLASS:
@Entity
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
public abstract class BillingDetails {

    @Id @GeneratedValue
    @Column(name = "BILLING_DETAILS_ID")
    private Long id = null;

    @Column(name = "OWNER", nullable = false)
    private String owner;
    ...
}
The database identifier and its mapping have to be present in the superclass, to be shared across all subclasses and their tables. An @Entity annotation on each subclass is all that is required:
@Entity
@Table(name = "CREDIT_CARD")
public class CreditCard extends BillingDetails {

    @Column(name = "NUMBER", nullable = false)
    private String number;
    ...
}
Note that TABLE_PER_CLASS is specified in the JPA standard as optional, so not all JPA implementations may support it. The actual implementation is also vendor dependent; in Hibernate, it’s equivalent to a <union-subclass> mapping in XML files. The same mapping looks like this in a JPA XML descriptor:
...
If your superclass is concrete, then an additional table is needed to hold instances of that class. We have to emphasize again that there is still no relationship between the database tables, except for the fact that they share some similar columns. The advantages of this mapping strategy are clearer if we examine polymorphic queries. For example, a query for BillingDetails executes the following SQL statement:
select
    BILLING_DETAILS_ID, OWNER,
    NUMBER, EXP_MONTH, EXP_YEAR,
    ACCOUNT, BANKNAME, SWIFT,
    CLAZZ_
from
  ( select
        BILLING_DETAILS_ID, OWNER,
        NUMBER, EXP_MONTH, EXP_YEAR,
        null as ACCOUNT, null as BANKNAME, null as SWIFT,
        1 as CLAZZ_
    from CREDIT_CARD
    union
    select
        BILLING_DETAILS_ID, OWNER,
        null as NUMBER, null as EXP_MONTH, null as EXP_YEAR, ...
        ACCOUNT, BANKNAME, SWIFT,
        2 as CLAZZ_
    from BANK_ACCOUNT
  )
This SELECT uses a FROM-clause subquery to retrieve all instances of BillingDetails from all concrete class tables. The tables are combined with a UNION operator, and a literal (in this case, 1 and 2) is inserted into the intermediate result; Hibernate reads this to instantiate the correct class given the data from a particular row. A union requires that the queries that are combined project over the same columns; hence, we have to pad and fill up nonexistent columns with NULL. You may ask whether this query will really perform better than two separate statements. Here we can let the database optimizer find the best execution plan to combine rows from several tables, instead of merging two result sets in memory as Hibernate’s polymorphic loader engine would do.
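How the CLAZZ_ literal might be turned back into an object can be sketched in plain Java. This is an illustration under stated assumptions, not Hibernate’s loader: UnionQueryDemo, Row, and instantiate() are hypothetical names, the classes carry only one property each, and the sample values are made up.

```java
class UnionQueryDemo {
    static abstract class BillingDetails { String owner; }
    static final class CreditCard extends BillingDetails { String number; }
    static final class BankAccount extends BillingDetails { String account; }

    // One row of the union result. Columns that a table lacks arrive as null,
    // and clazz holds the literal (1 or 2) that identifies the source table.
    static final class Row {
        final String owner;
        final String number;
        final String account;
        final int clazz;

        Row(String owner, String number, String account, int clazz) {
            this.owner = owner;
            this.number = number;
            this.account = account;
            this.clazz = clazz;
        }
    }

    // Picks the concrete class from the literal and copies the matching columns.
    static BillingDetails instantiate(Row row) {
        switch (row.clazz) {
            case 1: {
                CreditCard cc = new CreditCard();
                cc.owner = row.owner;
                cc.number = row.number;
                return cc;
            }
            case 2: {
                BankAccount ba = new BankAccount();
                ba.owner = row.owner;
                ba.account = row.account;
                return ba;
            }
            default:
                throw new IllegalArgumentException("Unknown CLAZZ_ value: " + row.clazz);
        }
    }

    public static void main(String[] args) {
        Row fromCreditCard = new Row("John Doe", "4111", null, 1);
        Row fromBankAccount = new Row("Jane Roe", null, "DE00", 2);
        System.out.println(instantiate(fromCreditCard).getClass().getSimpleName());
        System.out.println(instantiate(fromBankAccount).getClass().getSimpleName());
    }
}
```

The padded null columns are simply ignored for the class that doesn’t declare them, which is why the union can project the same column list for both tables.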
Another much more important advantage is the ability to handle polymorphic associations; for example, an association mapping from User to BillingDetails would now be possible. Hibernate can use a UNION query to simulate a single table as the target of the association mapping. We cover this topic in detail in chapter 7, section 7.3, “Polymorphic associations.” So far, the inheritance mapping strategies we’ve discussed don’t require extra consideration with regard to the SQL schema. No foreign keys are needed, and relations are properly normalized. This situation changes with the next strategy.
5.1.3
Table per class hierarchy
An entire class hierarchy can be mapped to a single table. This table includes columns for all properties of all classes in the hierarchy. The concrete subclass represented by a particular row is identified by the value of a type discriminator column. This approach is shown in figure 5.2. This mapping strategy is a winner in terms of both performance and simplicity. It’s the best-performing way to represent polymorphism—both polymorphic and nonpolymorphic queries perform well—and it’s even easy to implement by hand. Ad-hoc reporting is possible without complex joins or unions. Schema evolution is straightforward.
Figure 5.2 Mapping a whole class hierarchy to a single table
There is one major problem: Columns for properties declared by subclasses must be declared to be nullable. If your subclasses each define several nonnullable properties, the loss of NOT NULL constraints may be a serious problem from the point of view of data integrity. Another important issue is normalization. We’ve created functional dependencies between nonkey columns, violating the third normal form. As always, denormalization for performance can be misleading, because it sacrifices long-term stability, maintainability, and the integrity of data for immediate gains that may also be achieved by proper optimization of the SQL execution plans (in other words, ask your DBA).
In Hibernate, you use the <subclass> element to create a table per class hierarchy mapping, as in listing 5.2.
Listing 5.2 Hibernate <subclass> mapping
...
■ The root class BillingDetails of the inheritance hierarchy is mapped to the table BILLING_DETAILS.
■ You have to add a special column to distinguish between persistent classes: the discriminator. This isn’t a property of the persistent class; it’s used internally by Hibernate. The column name is BILLING_DETAILS_TYPE, and the values are strings, in this case “CC” or “BA”. Hibernate automatically sets and retrieves the discriminator values.
■ Properties of the superclass are mapped as always, with a simple <property> element.
■ Every subclass has its own <subclass> element. Properties of a subclass are mapped to columns in the BILLING_DETAILS table. Remember that NOT NULL constraints aren’t allowed, because a BankAccount instance won’t have an expMonth property, and the CC_EXP_MONTH field must be NULL for that row.
The <subclass> element can in turn contain other nested <subclass> elements, until the whole hierarchy is mapped to the table. Hibernate generates the following SQL when querying the BillingDetails class:
select BILLING_DETAILS_ID, BILLING_DETAILS_TYPE, OWNER,
       CC_NUMBER, CC_EXP_MONTH, ..., BA_ACCOUNT, BA_BANKNAME, ...
from BILLING_DETAILS
To query the CreditCard subclass, Hibernate adds a restriction on the discriminator column:
select BILLING_DETAILS_ID, OWNER, CC_NUMBER, CC_EXP_MONTH, ...
from BILLING_DETAILS
where BILLING_DETAILS_TYPE='CC'
This mapping strategy is also available in JPA, as SINGLE_TABLE:
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(
    name = "BILLING_DETAILS_TYPE",
    discriminatorType = DiscriminatorType.STRING
)
public abstract class BillingDetails {

    @Id @GeneratedValue
    @Column(name = "BILLING_DETAILS_ID")
    private Long id = null;

    @Column(name = "OWNER", nullable = false)
    private String owner;
    ...
}
If you don’t specify a discriminator column in the superclass, its name defaults to DTYPE and its type to string. All concrete classes in the inheritance hierarchy can have a discriminator value; in this case, BillingDetails is abstract, and CreditCard is a concrete class:
@Entity
@DiscriminatorValue("CC")
public class CreditCard extends BillingDetails {

    @Column(name = "CC_NUMBER")
    private String number;
    ...
}
Without an explicit discriminator value, Hibernate defaults to the fully qualified class name if you use Hibernate XML files and the entity name if you use annotations or JPA XML files. Note that no default is specified in Java Persistence for nonstring discriminator types; each persistence provider can have different defaults. This is the equivalent mapping in JPA XML descriptors:
... CC ...
Sometimes, especially in legacy schemas, you don’t have the freedom to include an extra discriminator column in your entity tables. In this case, you can apply a formula to calculate a discriminator value for each row:
... ...
This mapping relies on an SQL CASE/WHEN expression to determine whether a particular row represents a credit card or a bank account (many developers have never used this kind of SQL expression; check the ANSI standard if you aren’t familiar with it). The result of the expression is a literal, CC or BA, which in turn is declared on the <subclass> mappings. Formulas for discrimination aren’t part of the JPA specification. However, you can apply a Hibernate annotation:
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@org.hibernate.annotations.DiscriminatorFormula(
    "case when CC_NUMBER is not null then 'CC' else 'BA' end"
)
public abstract class BillingDetails {
    ...
}
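For readers who haven’t used CASE/WHEN, the discriminator formula reads roughly like the following plain-Java sketch. The class DiscriminatorFormulaDemo and its two methods are hypothetical, and the mapping from literal to class name is written out by hand here; in a real application, the database evaluates the expression and Hibernate resolves the literal.

```java
class DiscriminatorFormulaDemo {
    // Java rendering of: case when CC_NUMBER is not null then 'CC' else 'BA' end
    static String discriminator(String ccNumber) {
        return ccNumber != null ? "CC" : "BA";
    }

    // Maps the literal to the class name declared for it in the mapping.
    static String entityFor(String discriminator) {
        switch (discriminator) {
            case "CC": return "CreditCard";
            case "BA": return "BankAccount";
            default:
                throw new IllegalArgumentException("Unmapped discriminator: " + discriminator);
        }
    }

    public static void main(String[] args) {
        System.out.println(entityFor(discriminator("4111"))); // row with a CC_NUMBER
        System.out.println(entityFor(discriminator(null)));   // row without one
    }
}
```

Note that the formula treats every row without a CC_NUMBER as a bank account, so it only works if that column is reliably filled for credit cards.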
The disadvantages of the table per class hierarchy strategy may be too serious for your design—after all, denormalized schemas can become a major burden in the long run. Your DBA may not like it at all. The next inheritance mapping strategy doesn’t expose you to this problem.
5.1.4
Table per subclass
The fourth option is to represent inheritance relationships as relational foreign key associations. Every class/subclass that declares persistent properties—including abstract classes and even interfaces—has its own table. Unlike the table per concrete class strategy we mapped first, the table here contains columns only for each noninherited property (each property declared by the subclass itself) along with a primary key that is also a foreign key of the superclass table. This approach is shown in figure 5.3. If an instance of the CreditCard subclass is made persistent, the values of properties declared by the BillingDetails superclass are persisted to a new row of the BILLING_DETAILS table. Only the values of properties declared by the subclass are persisted to a new row of the CREDIT_CARD table. The two rows are linked together
by their shared primary key value. Later, the subclass instance may be retrieved from the database by joining the subclass table with the superclass table. The primary advantage of this strategy is that the SQL schema is normalized. Schema evolution and integrity constraint definition are straightforward. A polymorphic association to a particular subclass may be represented as a foreign key referencing the table of that particular subclass. In Hibernate, you use the <joined-subclass> element to create a table per subclass mapping. See listing 5.3.
Figure 5.3 Mapping all classes of the hierarchy to their own table
Listing 5.3 Hibernate <joined-subclass> mapping
...
■ The root class BillingDetails is mapped to the table BILLING_DETAILS. Note that no discriminator is required with this strategy.
■ The new <joined-subclass> element maps a subclass to a new table, in this example CREDIT_CARD. All properties declared in the joined subclass are mapped to this table.
■ A primary key is required for the CREDIT_CARD table. This column also has a foreign key constraint to the primary key of the BILLING_DETAILS table. A CreditCard object lookup requires a join of both tables.
A <joined-subclass> element may contain other nested <joined-subclass> elements, until the whole hierarchy has been mapped. Hibernate relies on an outer join when querying the BillingDetails class:
select BD.BILLING_DETAILS_ID, BD.OWNER,
       CC.NUMBER, CC.EXP_MONTH, ...,
       BA.ACCOUNT, BA.BANKNAME, ...
       case
           when CC.CREDIT_CARD_ID is not null then 1
           when BA.BANK_ACCOUNT_ID is not null then 2
           when BD.BILLING_DETAILS_ID is not null then 0
       end as CLAZZ_
from BILLING_DETAILS BD
    left join CREDIT_CARD CC
        on BD.BILLING_DETAILS_ID = CC.CREDIT_CARD_ID
    left join BANK_ACCOUNT BA
        on BD.BILLING_DETAILS_ID = BA.BANK_ACCOUNT_ID
The SQL CASE statement detects the existence (or absence) of rows in the subclass tables CREDIT_CARD and BANK_ACCOUNT, so Hibernate can determine the concrete subclass for a particular row of the BILLING_DETAILS table. To narrow the query to the subclass, Hibernate uses an inner join:
select BD.BILLING_DETAILS_ID, BD.OWNER, CC.NUMBER, ...
from CREDIT_CARD CC
    inner join BILLING_DETAILS BD
        on BD.BILLING_DETAILS_ID = CC.CREDIT_CARD_ID
As you can see, this mapping strategy is more difficult to implement by hand; even ad hoc reporting is more complex. This is an important consideration if you plan to mix Hibernate code with handwritten SQL. Furthermore, even though this mapping strategy is deceptively simple, our experience is that performance can be unacceptable for complex class hierarchies. Queries always require either a join across many tables or many sequential reads.
Let’s map the hierarchy with the same strategy and annotations, here called the JOINED strategy:
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
public abstract class BillingDetails {

    @Id @GeneratedValue
    @Column(name = "BILLING_DETAILS_ID")
    private Long id = null;

    ...
}
In subclasses, you don’t need to specify the join column if the primary key column of the subclass table has (or is supposed to have) the same name as the primary key column of the superclass table:
@Entity
public class BankAccount extends BillingDetails {
    ...
}
This entity has no identifier property; it automatically inherits the BILLING_DETAILS_ID property and column from the superclass, and Hibernate knows how to join the tables together if you want to retrieve instances of BankAccount. Of course, you can specify the column name explicitly:
@Entity
@PrimaryKeyJoinColumn(name = "CREDIT_CARD_ID")
public class CreditCard extends BillingDetails {
    ...
}
Finally, this is the equivalent mapping in JPA XML descriptors:
...
Before we show you when to choose which strategy, let’s consider mixing inheritance mapping strategies in a single class hierarchy.
5.1.5
Mixing inheritance strategies
You can map whole inheritance hierarchies by nesting <union-subclass>, <subclass>, and <joined-subclass> mapping elements. You can’t mix them, for example, to switch from a table-per-class hierarchy with a discriminator to a normalized table-per-subclass strategy. Once you’ve made a decision for an inheritance strategy, you have to stick to it.
This isn’t completely true, however. With some Hibernate tricks, you can switch the mapping strategy for a particular subclass. For example, you can map a class hierarchy to a single table, but for a particular subclass, switch to a separate table with a foreign key mapping strategy, just as with table per subclass. This is possible with the <join> mapping element:
...
... ... ... ...
The <join> element groups some properties and tells Hibernate to get them from a secondary table. This mapping element has many uses, and you’ll see it again later in the book. In this example, it separates the CreditCard properties from the table per hierarchy into the CREDIT_CARD table. The CREDIT_CARD_ID column of this table is at the same time the primary key, and it has a foreign key constraint referencing the BILLING_DETAILS_ID of the hierarchy table. The BankAccount subclass is mapped to the hierarchy table. Look at the schema in figure 5.4.
At runtime, Hibernate executes an outer join to fetch BillingDetails and all subclass instances polymorphically:
select
    BILLING_DETAILS_ID, BILLING_DETAILS_TYPE, OWNER,
    CC.CC_NUMBER, CC.CC_EXP_MONTH, CC.CC_EXP_YEAR,
    BA_ACCOUNT, BA_BANKNAME, BA_SWIFT
from
    BILLING_DETAILS
    left outer join CREDIT_CARD CC
        on BILLING_DETAILS_ID = CC.CREDIT_CARD_ID
Figure 5.4
Breaking out a subclass to its own secondary table
You can also use the trick for other subclasses in your class hierarchy. However, if you have an exceptionally wide class hierarchy, the outer join can become a problem. Some database systems (Oracle, for example) limit the number of tables in an outer join operation. For a wide hierarchy, you may want to switch to a different fetching strategy that executes an immediate second select instead of an outer join:
...
Java Persistence also supports this mixed inheritance mapping strategy with annotations. Map the superclass BillingDetails with InheritanceType.SINGLE_TABLE, as you did before. Now map the subclass you want to break out of the single table to a secondary table.
@Entity
@DiscriminatorValue("CC")
@SecondaryTable(
    name = "CREDIT_CARD",
    pkJoinColumns = @PrimaryKeyJoinColumn(name = "CREDIT_CARD_ID")
)
public class CreditCard extends BillingDetails {

    @Column(table = "CREDIT_CARD",
            name = "CC_NUMBER",
            nullable = false)
    private String number;

    ...
}
If you don’t specify a primary key join column for the secondary table, the name of the primary key of the single inheritance table is used (in this case, BILLING_DETAILS_ID). Also note that you need to map all properties that are moved into the secondary table with the name of that secondary table.
You probably also want some tips on how to choose an appropriate combination of mapping strategies for your application’s class hierarchies.
5.1.6
Choosing a strategy
You can apply all mapping strategies to abstract classes and interfaces. Interfaces may have no state but may contain accessor method declarations, so they can be treated like abstract classes. You can map an interface with <class>, <union-subclass>, <subclass>, or <joined-subclass>, and you can map any declared or inherited property with <property>. Hibernate won’t try to instantiate an abstract class, even if you query or load it.
NOTE
Note that the JPA specification doesn’t support any mapping annotation on an interface! This will be resolved in a future version of the specification; when you read this book, it will probably be possible with Hibernate Annotations.
Here are some rules of thumb:
■ If you don’t require polymorphic associations or queries, lean toward table-per-concrete-class; in other words, if you never or rarely query for BillingDetails and you have no class that has an association to BillingDetails (our model has). An explicit UNION-based mapping should be preferred, because (optimized) polymorphic queries and associations will then be possible later. Implicit polymorphism is mostly useful for queries utilizing non-persistence-related interfaces.
■ If you do require polymorphic associations (an association to a superclass, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime) or queries, and subclasses declare relatively few properties (particularly if the main difference between subclasses is in their behavior), lean toward table-per-class-hierarchy. Your goal is to minimize the number of nullable columns and to convince yourself (and your DBA) that a denormalized schema won’t create problems in the long run.
■ If you do require polymorphic associations or queries, and subclasses declare many properties (subclasses differ mainly by the data they hold), lean toward table-per-subclass. Or, depending on the width and depth of your inheritance hierarchy and the possible cost of joins versus unions, use table-per-concrete-class.
By default, choose table-per-class-hierarchy only for simple problems. For more complex cases (or when you’re overruled by a data modeler insisting on the importance of nullability constraints and normalization), you should consider the table-per-subclass strategy. But at that point, ask yourself whether it may not be better to remodel inheritance as delegation in the object model. Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. Hibernate acts as a buffer between the domain and relational models, but that doesn’t mean you can ignore persistence concerns when designing your classes.
When you start thinking about mixing inheritance strategies, remember that implicit polymorphism in Hibernate is smart enough to handle more exotic cases. For example, consider an additional interface in our application, ElectronicPaymentOption. This is a business interface that doesn’t have a persistence aspect, except that in our application, a persistent class such as CreditCard will likely implement this interface. No matter how you map the BillingDetails hierarchy, Hibernate can answer a query from ElectronicPaymentOption correctly. This even works if other classes, which aren’t part of the BillingDetails hierarchy, are mapped persistent and implement this interface. Hibernate always knows what tables to query, which instances to construct, and how to return a polymorphic result.
Finally, you can also use <union-subclass>, <subclass>, and <joined-subclass> mapping elements in a separate mapping file (as a top-level element instead of <class>). You then have to declare the class that is extended, such as <subclass name="CreditCard" extends="BillingDetails">, and the superclass mapping must be loaded programmatically before the subclass mapping file (you don’t have to worry about this order when you list mapping resources in the XML configuration file). This technique allows you to extend a class hierarchy without modifying the mapping file of the superclass.
You now know everything you need to know about the mapping of entities, properties, and inheritance hierarchies. You can already map complex domain models. In the second half of this chapter, we discuss another important feature that you should know by heart as a Hibernate user: the Hibernate mapping type system.
5.2
The Hibernate type system
In chapter 4, we first distinguished between entity and value types—a central concept of ORM in Java. We must elaborate on that distinction in order for you to fully understand the Hibernate type system of entities, value types, and mapping types.
5.2.1
Recapitulating entity and value types
Entities are the coarse-grained classes in your system. You usually define the features of a system in terms of the entities involved. The user places a bid for an item is a typical feature definition; it mentions three entities. Classes of value types often don’t even appear in the business requirements—they’re usually the fine-grained classes representing strings, numbers, and monetary amounts. Occasionally, value types do appear in feature definitions: the user changes billing address is one example, assuming that Address is a value type. More formally, an entity is any class whose instances have their own persistent identity. A value type is a class that doesn’t define some kind of persistent identity. In practice, this means that entity types are classes with identifier properties, and value type classes depend on an entity. At runtime, you have a network of entity instances interleaved with value type instances. The entity instances may be in any of the three persistent lifecycle states: transient, detached, or persistent. We don’t consider these lifecycle states to apply to the value type instances. (We’ll come back to this discussion of object states in chapter 9.) Therefore, entities have their own lifecycle. The save() and delete() methods of the Hibernate Session interface apply to instances of entity classes, never to value type instances. The persistence lifecycle of a value type instance is completely tied to the lifecycle of the owning entity instance. For example, the username becomes persistent when the user is saved; it never becomes persistent independently of the user.
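The distinction between entities and value types can be sketched in plain Java, without any Hibernate dependency. This is only an illustration under assumptions: the Address fields (street, zipcode, city) and the User constructor are chosen for the example, and the classes are package-private only so the sketch fits in one file. The entity carries an identifier property; the value type has none and is compared by its state.

```java
import java.util.Objects;

/** Value type: no identifier property; instances with equal state are interchangeable. */
final class Address {
    private final String street;
    private final String zipcode;
    private final String city;

    Address(String street, String zipcode, String city) {
        this.street = street;
        this.zipcode = zipcode;
        this.city = city;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return Objects.equals(street, other.street)
            && Objects.equals(zipcode, other.zipcode)
            && Objects.equals(city, other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(street, zipcode, city);
    }
}

/** Entity: has an identifier property that represents its persistent identity. */
final class User {
    private Long id;                 // stays null until the instance is made persistent
    private final String username;   // value-typed property
    private Address homeAddress;     // value type instance owned by this User

    User(String username, Address homeAddress) {
        this.username = username;
        this.homeAddress = homeAddress;
    }

    Long getId() { return id; }
    String getUsername() { return username; }
    Address getHomeAddress() { return homeAddress; }
    void setHomeAddress(Address homeAddress) { this.homeAddress = homeAddress; }
}
```

In this sketch nothing ever saves an Address on its own; it only travels with the User that owns it, which mirrors the lifecycle rule described above.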
In Hibernate, a value type may define associations; it’s possible to navigate from a value type instance to some other entity. However, it’s never possible to navigate from the other entity back to the value type instance. Associations always point to entities. This means that a value type instance is owned by exactly one entity when it’s retrieved from the database; it’s never shared.
At the level of the database, any table is considered an entity. However, Hibernate provides certain constructs to hide the existence of a database-level entity from the Java code. For example, a many-to-many association mapping hides the intermediate association table from the application. A collection of strings (more accurately, a collection of value-typed instances) behaves like a value type from the point of view of the application; however, it’s mapped to its own table. Although these features seem nice at first (they simplify the Java code), we have over time become suspicious of them. Inevitably, these hidden entities end up needing to be exposed to the application as business requirements evolve. The many-to-many association table, for example, often has additional columns added as the application matures. We’re almost prepared to recommend that every database-level entity be exposed to the application as an entity class. For example, we would be inclined to model the many-to-many association as two one-to-many associations to an intervening entity class. We’ll leave the final decision to you, however, and come back to the topic of many-to-many entity associations in later chapters.
Entity classes are always mapped to the database using <class>, <union-subclass>, <subclass>, and <joined-subclass> mapping elements. How are value types mapped? You’ve already met two different kinds of value type mappings: <property> and <component>. The value type of a component is obvious: It’s the class that is mapped as embeddable. However, the type of a property is a more generic notion. Consider this mapping of the CaveatEmptor User and email address:
Let’s focus on that type="string" attribute. You know that in ORM you have to deal with Java types and SQL datatypes. The two different type systems must be bridged. This is the job of the Hibernate mapping types, and string is the name of a built-in Hibernate mapping type.
The string mapping type isn’t the only one built into Hibernate. Hibernate comes with various mapping types that define default persistence strategies for primitive Java types and certain JDK classes.
5.2.2
Built-in mapping types
Hibernate’s built-in mapping types usually share the name of the Java type they map. However, there may be more than one Hibernate mapping type for a particular Java type. The built-in types may not be used to perform arbitrary conversions, such as mapping a VARCHAR database value to a Java Integer property value. You may define your own custom value types for this kind of conversion, as shown later in this chapter. We now discuss the basic, date and time, locator object, and various other built-in mapping types and show you what Java and SQL datatype they handle.
Java primitive mapping types
The basic mapping types in table 5.1 map Java primitive types (or their wrapper types) to appropriate built-in SQL standard types.
Table 5.1 Primitive types
Mapping type   Java type                        Standard SQL built-in type
integer        int or java.lang.Integer         INTEGER
long           long or java.lang.Long           BIGINT
short          short or java.lang.Short         SMALLINT
float          float or java.lang.Float         FLOAT
double         double or java.lang.Double       DOUBLE
big_decimal    java.math.BigDecimal             NUMERIC
character      java.lang.String                 CHAR(1)
string         java.lang.String                 VARCHAR
byte           byte or java.lang.Byte           TINYINT
boolean        boolean or java.lang.Boolean     BIT
yes_no         boolean or java.lang.Boolean     CHAR(1) ('Y' or 'N')
true_false     boolean or java.lang.Boolean     CHAR(1) ('T' or 'F')
You’ve probably noticed that your database doesn’t support some of the SQL types mentioned in table 5.1. The listed type names are names of ANSI-standard datatypes. Most database vendors ignore this part of the SQL standard (because their legacy type systems often predate the standard). However, the JDBC driver provides a partial abstraction of vendor-specific SQL datatypes, allowing Hibernate to work with ANSI-standard types when executing DML. For database-specific DDL generation, Hibernate translates from the ANSI-standard type to an appropriate vendor-specific type, using the built-in support for specific SQL dialects. (This means you usually don’t have to worry about SQL datatypes if you’re using Hibernate for data access and SQL schema definition.) Furthermore, the Hibernate type system is smart and can switch SQL datatypes depending on the defined length of a value. The most obvious case is string: If you declare a string property mapping with a length attribute, Hibernate picks the correct SQL datatype depending on the selected dialect. For MySQL, for example, a length of up to 65535 results in a regular VARCHAR(length) column when Hibernate exports the schema. For a length of up to 16777215, a MEDIUMTEXT datatype is used. Larger string mappings result in a LONGTEXT. Check your SQL dialect (the source code comes with Hibernate) if you want to know the ranges for this and other mapping types. You can customize this behavior by subclassing your dialect and overriding these settings. Most dialects also support setting the scale and precision of decimal SQL datatypes. For example, a precision or scale setting in your mapping of a BigDecimal creates a NUMERIC(precision, scale) datatype for MySQL. 
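To make precision and scale concrete, here is a small sketch in plain Java. It is not Hibernate code: the helper name fits is hypothetical, and the rule it applies (at most scale fractional digits and at most precision minus scale integer digits) is the usual reading of NUMERIC(precision, scale); how a particular database reacts to values that don't fit (rounding or an error) varies by vendor.

```java
import java.math.BigDecimal;

final class NumericColumnDemo {

    /**
     * Returns true if the value could be stored in a NUMERIC(precision, scale)
     * column without rounding: at most `scale` fractional digits and at most
     * `precision - scale` integer digits.
     */
    static boolean fits(BigDecimal value, int precision, int scale) {
        if (value.signum() == 0) {
            return true;                              // zero fits any column
        }
        BigDecimal v = value.stripTrailingZeros();    // 10.50 is treated like 10.5
        int fractionDigits = Math.max(v.scale(), 0);  // a negative scale means no fraction
        int integerDigits = v.precision() - v.scale();
        return fractionDigits <= scale && integerDigits <= precision - scale;
    }

    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("12345.67");
        System.out.println("precision=" + price.precision() + ", scale=" + price.scale());
        System.out.println("fits NUMERIC(7,2): " + fits(price, 7, 2));
        System.out.println("fits NUMERIC(6,2): " + fits(price, 6, 2));
    }
}
```

A check like this can be useful in validation code or tests when the precision and scale chosen in the mapping are tight for the values the application produces.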
Finally, the yes_no and true_false mapping types are converters that are mostly useful for legacy schemas and Oracle users; Oracle DBMS products don’t have a built-in boolean or truth-valued type (the only built-in datatype actually required by the relational data model).
Date and time mapping types
Table 5.2 lists Hibernate types associated with dates, times, and timestamps. In your domain model, you may choose to represent date and time data using java.util.Date, java.util.Calendar, or the subclasses of java.util.Date defined in the java.sql package. This is a matter of taste, and we leave the decision to you; make sure you’re consistent, however. (In practice, binding your domain model to types from the JDBC package isn’t the best idea.)
Table 5.2 Date and time types
Mapping type    Java type                               Standard SQL built-in type
date            java.util.Date or java.sql.Date         DATE
time            java.util.Date or java.sql.Time         TIME
timestamp       java.util.Date or java.sql.Timestamp    TIMESTAMP
calendar        java.util.Calendar                      TIMESTAMP
calendar_date   java.util.Calendar                      DATE
A caveat: If you map a java.util.Date property with timestamp (the most common case), Hibernate returns a java.sql.Timestamp when loading the property from the database. Hibernate has to use the JDBC subclass because it includes
nanosecond information that may be present in the database. Hibernate can’t just cut off this information. This can lead to problems if you try to compare your java.util.Date properties with the equals() method, because it isn’t symmetric with the java.sql.Timestamp subclass equals() method. First, the right way (in any case) to compare two java.util.Date objects, which also works for any subclass, is aDate.getTime() > bDate.getTime() (for a greater-than comparison). Second, you can write a custom mapping type that cuts off the database nanosecond information and returns a java.util.Date in all cases. Currently (although this may change in the future), no such mapping type is built into Hibernate.
Binary and large value mapping types
Table 5.3 lists Hibernate types for handling binary data and large values. Note that only binary is supported as the type of an identifier property.
Table 5.3 Binary and large value types
Mapping type   Java type                                             Standard SQL built-in type
binary         byte[]                                                VARBINARY
text           java.lang.String                                      CLOB
clob           java.sql.Clob                                         CLOB
blob           java.sql.Blob                                         BLOB
serializable   Any Java class that implements java.io.Serializable   VARBINARY
If a property in your persistent Java class is of type byte[], Hibernate can map it to a VARBINARY column with the binary mapping type. (Note that the real SQL type depends on the dialect; for example, in PostgreSQL, the SQL type is BYTEA, and in Oracle it’s RAW.) If a property in your persistent Java class is of type java.lang.String, Hibernate can map it to an SQL CLOB column, with the text mapping type.
Note that in both cases, Hibernate initializes the property value right away, when the entity instance that holds the property variable is loaded. This is inconvenient when you have to deal with potentially large values. One solution is lazy loading through interception of field access, on demand. However, this approach requires bytecode instrumentation of your persistent classes for the injection of extra code. We’ll discuss lazy loading through bytecode instrumentation and interception in chapter 13, section 13.1.6, “Lazy loading with interception.”
A second solution is a different kind of property in your Java class. JDBC supports locator objects (LOBs) directly (see footnote 1). If your Java property is of type java.sql.Clob or java.sql.Blob, you can map it with the clob or blob mapping type to get lazy loading of large values without bytecode instrumentation. When the owner of the property is loaded, the property value is a locator object (effectively, a pointer to the real value that isn’t yet materialized). Once you access the property, the value is materialized. This on-demand loading works only as long as the database transaction is open, so you need to access any property of such a type when the owning entity instance is in a persistent and transactional state, not in detached state. Your domain model is now also bound to JDBC, because the import of the java.sql package is required. Although domain model classes are executable in isolated unit tests, you can’t access LOB properties without a database connection.
Mapping properties with potentially large values is slightly different if you rely on Java Persistence annotations.
By default, a property of type java.lang.String is mapped to an SQL VARCHAR column (or equivalent, depending on the SQL dialect). If you want to map a java.lang.String, char[], Character[], or even a java.sql.Clob typed property to a CLOB column, you need to map it with the @Lob annotation:
@Lob
@Column(name = "ITEM_DESCRIPTION")
private String description;

@Lob
@Column(name = "ITEM_IMAGE")
private byte[] image;

1. Jim Starkey, who came up with the idea of LOBs, says that the terms BLOB and CLOB don’t mean anything but were created by the marketing department. You can interpret them any way you like. We prefer locator objects, as a hint that they work like pointers.
The same is true for any property that is of type byte[], Byte[], or java.sql.Blob. Note that for all cases, except properties that are of java.sql.Clob or java.sql.Blob type, the values are again loaded immediately by Hibernate, and not lazily on demand. Instrumenting bytecode with interception code is again an option to enable lazy loading of individual properties transparently.
To create and set a java.sql.Blob or java.sql.Clob value, if you have these property types in your domain model, use the static Hibernate.createBlob() and Hibernate.createClob() methods and provide a byte array, an input stream, or a string.
Finally, note that both Hibernate and JPA provide a serialization fallback for any property type that is Serializable. This mapping type converts the value of a property to a byte stream that is then stored in a VARBINARY (or equivalent) column. When the owner of the property is loaded, the property value is deserialized. Naturally, you should use this strategy with extreme caution (data lives longer than an application), and it may be useful only for temporary data (user preferences, login session data, and so on).
JDK mapping types
Table 5.4 lists Hibernate types for various other Java types of the JDK that may be represented as a VARCHAR in the database. You may have noticed that <property> isn’t the only Hibernate mapping element that has a type attribute.
Table 5.4 Other JDK-related types
Mapping type   Java type             Standard SQL built-in type
class          java.lang.Class       VARCHAR
locale         java.util.Locale      VARCHAR
timezone       java.util.TimeZone    VARCHAR
currency       java.util.Currency    VARCHAR
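As a rough illustration of why the types in table 5.4 fit into a VARCHAR column, each of them has a natural string form in the JDK. The sketch below uses only standard JDK calls; the class and method names are hypothetical, and whether Hibernate stores exactly these string forms is an assumption made for the example, not something stated in the text.

```java
import java.util.Currency;
import java.util.Locale;
import java.util.TimeZone;

final class JdkTypeStrings {

    // Java value -> string that could be stored in a VARCHAR column
    static String toColumnValue(Class<?> type)     { return type.getName(); }
    static String toColumnValue(Locale locale)     { return locale.toString(); }
    static String toColumnValue(TimeZone zone)     { return zone.getID(); }
    static String toColumnValue(Currency currency) { return currency.getCurrencyCode(); }

    // string read from the column -> Java value
    static Class<?> classFrom(String value) throws ClassNotFoundException {
        return Class.forName(value);
    }
    static TimeZone timeZoneFrom(String value) { return TimeZone.getTimeZone(value); }
    static Currency currencyFrom(String value) { return Currency.getInstance(value); }

    public static void main(String[] args) throws Exception {
        System.out.println(toColumnValue(String.class));
        System.out.println(toColumnValue(Locale.US));
        System.out.println(toColumnValue(TimeZone.getTimeZone("UTC")));
        System.out.println(toColumnValue(Currency.getInstance("USD")));
        System.out.println(classFrom("java.lang.String") == String.class);
    }
}
```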
5.2.3
Using mapping types
All of the basic mapping types may appear almost anywhere in the Hibernate mapping document, on normal property, identifier property, and other mapping elements. The <id>, <property>, <version>, <discriminator>, <index>, and <element> elements all define an attribute named type. You can see how useful the built-in mapping types are in this mapping for the BillingDetails class:
....
The BillingDetails class is mapped as an entity. Its discriminator, identifier, and name properties are value typed, and we use the built-in Hibernate mapping types to specify the conversion strategy. It isn’t often necessary to explicitly specify a built-in mapping type in the XML mapping document. For instance, if you have a property of Java type java.lang.String, Hibernate discovers this using reflection and selects string by default. We can easily simplify the previous mapping example:
....
Hibernate also understands type="java.lang.String"; it doesn’t have to use reflection then. The most important case where this approach doesn’t work well is a java.util.Date property. By default, Hibernate interprets a java.util.Date as a timestamp mapping. You need to explicitly specify type="time" or type="date" if you don’t wish to persist both date and time information. With JPA annotations, the mapping type of a property is automatically detected, just like in Hibernate. For a java.util.Date or java.util.Calendar property, the Java Persistence standard requires that you select the precision with a @Temporal annotation:
@Temporal(TemporalType.TIMESTAMP)
@Column(nullable = false, updatable = false)
private Date startDate;
On the other hand, Hibernate Annotations, relaxing the rules of the standard, defaults to TemporalType.TIMESTAMP—options are TemporalType.TIME and TemporalType.DATE. In other rare cases, you may want to add the @org.hibernate.annotations.Type annotation to a property and declare the name of a built-in or custom Hibernate mapping type explicitly. This is a much more common extension as soon as you start writing your own custom mapping types, which you’ll do later in this chapter. The equivalent JPA XML descriptor is as follows:
... TIMESTAMP
For each of the built-in mapping types, a constant is defined by the class org.hibernate.Hibernate. For example, Hibernate.STRING represents the string mapping type. These constants are useful for query parameter binding, as discussed in more detail in chapters 14 and 15:
session.createQuery("from Item i where i.description like :desc")
    .setParameter("desc", d, Hibernate.STRING)
    .list();
Note that you may as well use the setString() argument binding method in this case. Type constants are also useful for programmatic manipulation of the Hibernate mapping metamodel, as discussed in chapter 3. Hibernate isn’t limited to the built-in mapping types. We consider the extensible mapping-type system one of the core features and an important aspect that makes Hibernate so flexible.
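Because java.util.Date properties are mapped as timestamp by default, the equals() caveat from the date and time types discussion is worth seeing in isolation. The following sketch uses only JDK classes; the java.sql.Timestamp instance is a stand-in for a value Hibernate would return on load, and the helper names sameInstant and toPlainDate are hypothetical. The toPlainDate helper shows the kind of truncation a custom mapping type could apply; it is not an existing Hibernate type.

```java
import java.sql.Timestamp;
import java.util.Date;

final class DateComparisonDemo {

    /** Compares by millisecond value; works for java.util.Date and any subclass. */
    static boolean sameInstant(Date a, Date b) {
        return a.getTime() == b.getTime();
    }

    /** Drops sub-millisecond information and returns a plain java.util.Date. */
    static Date toPlainDate(Date value) {
        return new Date(value.getTime());
    }

    public static void main(String[] args) {
        long millis = 1_000_000_000_000L;
        Date assigned = new Date(millis);      // value set by application code
        Date loaded = new Timestamp(millis);   // stand-in for a value loaded from the database

        // Date.equals() accepts any Date with the same millisecond value...
        System.out.println("assigned.equals(loaded): " + assigned.equals(loaded));
        // ...but Timestamp.equals() rejects anything that isn't a Timestamp.
        System.out.println("loaded.equals(assigned): " + loaded.equals(assigned));
        System.out.println("sameInstant: " + sameInstant(assigned, loaded));
        System.out.println("after truncation: " + toPlainDate(loaded).equals(assigned));
    }
}
```

Comparing through getTime() sidesteps the asymmetry entirely, which is why it is the safer habit in domain model code that may see either class at runtime.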
5.3
Creating custom mapping types
Object-oriented languages like Java make it easy to define new types by writing new classes. This is a fundamental part of the definition of object-orientation. If we were then limited to the predefined built-in Hibernate mapping types when declaring properties of our persistent classes, we would lose much of Java’s expressiveness. Furthermore, our domain model implementation would be tightly coupled to the physical data model, because new type conversions would be impossible.
Most ORM solutions that we have seen provide support for user-defined strategies for performing type conversions. These are often called converters. For example, the user can create a new strategy for persisting a property of JDK type Integer to a VARCHAR column. Hibernate provides a similar, much more powerful, feature called custom mapping types.
First you need to understand when it’s appropriate to write your own custom mapping type, and which Hibernate extension point is relevant for you. We’ll then write some custom mapping types and explore the options.
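The converter idea mentioned above (persisting an Integer property to a VARCHAR column) can be sketched in a few lines of plain Java. The ValueConverter interface here is hypothetical and exists only for this illustration; it is not a Hibernate interface, and the trimming of whitespace is an assumption about how a legacy column might be padded.

```java
/** Hypothetical converter contract for illustration; NOT part of Hibernate. */
interface ValueConverter<J, D> {
    D toDatabase(J javaValue);
    J toJava(D databaseValue);
}

/** Converts between a Java Integer property and the text held in a VARCHAR column. */
final class IntegerToVarcharConverter implements ValueConverter<Integer, String> {

    @Override
    public String toDatabase(Integer javaValue) {
        return javaValue == null ? null : javaValue.toString();
    }

    @Override
    public Integer toJava(String databaseValue) {
        // Assumes the column holds a decimal integer, possibly padded with spaces;
        // Integer.valueOf throws NumberFormatException for anything else.
        return databaseValue == null ? null : Integer.valueOf(databaseValue.trim());
    }
}
```

A converter of this kind only translates a single value in each direction; the extension points described in the following sections give a custom type considerably more responsibility than that.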
5.3.1
Considering custom mapping types
As an example, take the mapping of the Address class from previous chapters, as a component:
This value type mapping is straightforward; all properties of the new user-defined Java type are mapped to individual columns of a built-in SQL datatype. However, you can alternatively map it as a simple property, with a custom mapping type:
This is also probably the first time you’ve seen a single <property> element with several <column> elements nested inside. We’re moving the responsibility for translating and converting between an Address value type (it isn’t even named anywhere) and the three named columns to a separate class: auction.persistence.CustomAddressType. This class is now responsible for loading and saving this property. Note that no Java code changes in the domain model implementation; the homeAddress property is of type Address.
Granted, the benefit of replacing a component mapping with a custom mapping type is dubious in this case. As long as you require no special conversion when loading and saving this object, the CustomAddressType you now have to write is just additional work. However, you can already see that custom mapping types provide an additional buffer—something that may come in handy in the long run when extra conversion is required. Of course, there are better use cases for custom mapping types, as you’ll soon see. (Many examples of useful Hibernate mapping types can be found on the Hibernate community website.) Let’s look at the Hibernate extension points for the creation of custom mapping types.
5.3.2
The extension points
Hibernate provides several interfaces that applications may use when defining custom mapping types. These interfaces reduce the work involved in creating new mapping types and insulate the custom type from changes to the Hibernate core. This allows you to easily upgrade Hibernate and keep your existing custom mapping types. The extension points are as follows:
■ org.hibernate.usertype.UserType: The basic extension point, which is useful in many situations. It provides the basic methods for custom loading and storing of value type instances.
■ org.hibernate.usertype.CompositeUserType: An interface with more methods than the basic UserType, used to expose internals about your value type class to Hibernate, such as the individual properties. You can then refer to these properties in Hibernate queries.
■ org.hibernate.usertype.UserCollectionType: A rarely needed interface that’s used to implement custom collections. A custom mapping type implementing this interface isn’t declared on a property mapping but is useful only for custom collection mappings. You have to implement this type if you want to persist a non-JDK collection and preserve additional semantics persistently. We discuss collection mappings and this extension point in the next chapter.
■ org.hibernate.usertype.EnhancedUserType: An interface that extends UserType and provides additional methods for marshalling value types to and from XML representations, or enables a custom mapping type for use in identifier and discriminator mappings.
■ org.hibernate.usertype.UserVersionType: An interface that extends UserType and provides additional methods enabling the custom mapping type for usage in entity version mappings.
■ org.hibernate.usertype.ParameterizedType: A useful interface that can be combined with all others to provide configuration settings, that is, parameters defined in metadata. For example, you can write a single MoneyConverter that knows how to translate values into Euro or US dollars, depending on a parameter in the mapping.
We’ll now create some custom mapping types. You shouldn’t consider this an unnecessary exercise, even if you’re happy with the built-in Hibernate mapping types. In our experience, every sophisticated application has many good use cases for custom mapping types.
5.3.3
The case for custom mapping types
The Bid class defines an amount property, and the Item class defines an initialPrice property; both are monetary values. So far, we’ve used only a simple BigDecimal to represent the value, mapped with big_decimal to a single NUMERIC column. Suppose you want to support multiple currencies in the auction application and that you have to refactor the existing domain model for this (customer-driven) change. One way to implement this change would be to add new properties to Bid and Item: amountCurrency and initialPriceCurrency. You could then map these new properties to additional VARCHAR columns with the built-in currency mapping type. We hope you never use this approach! Instead, you should create a new MonetaryAmount class that encapsulates both currency and amount. Note that this is a class of your domain model; it doesn’t have any dependency on Hibernate interfaces:
    public class MonetaryAmount implements Serializable {

        private final BigDecimal amount;
        private final Currency currency;

        public MonetaryAmount(BigDecimal amount, Currency currency) {
            this.amount = amount;
            this.currency = currency;
        }

        public BigDecimal getAmount() { return amount; }

        public Currency getCurrency() { return currency; }

        public boolean equals(Object o) { ... }
        public int hashCode() { ...}
    }
We have made MonetaryAmount an immutable class. This is a good practice in Java because it simplifies coding. Note that you have to implement equals() and hashCode() to finish the class (there is nothing special to consider here). You use this new MonetaryAmount to replace the BigDecimal of the initialPrice property in Item. You can and should use it for all other BigDecimal prices in any persistent classes, such as the Bid.amount, and in business logic—for example, in the billing system. Let’s map the refactored initialPrice property of Item, with its new MonetaryAmount type to the database.
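The class above leaves equals() and hashCode() elided. One possible way to finish it is sketched below. The numeric comparison of amounts (so that 10.0 and 10.00 count as equal) is an assumption of this sketch, not something the text prescribes; comparing with BigDecimal.equals(), which also compares scale, would be an equally valid choice. The class is declared without the public modifier only so that the snippet compiles in any file.

```java
import java.io.Serializable;
import java.math.BigDecimal;
import java.util.Currency;

// Immutable value type; fields and accessors follow the class shown in the text.
final class MonetaryAmount implements Serializable {

    private final BigDecimal amount;
    private final Currency currency;

    public MonetaryAmount(BigDecimal amount, Currency currency) {
        if (amount == null || currency == null) {
            throw new IllegalArgumentException("amount and currency are required");
        }
        this.amount = amount;
        this.currency = currency;
    }

    public BigDecimal getAmount() { return amount; }

    public Currency getCurrency() { return currency; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MonetaryAmount)) return false;
        MonetaryAmount other = (MonetaryAmount) o;
        // Assumption: compare numerically, ignoring the scale of the BigDecimal.
        return amount.compareTo(other.amount) == 0
            && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        // Normalize the amount so that numerically equal values hash alike;
        // zero is handled separately because its scale may survive stripping
        // on older JDKs.
        int amountHash = amount.signum() == 0
            ? 0
            : amount.stripTrailingZeros().hashCode();
        return 31 * amountHash + currency.hashCode();
    }
}
```

Whichever comparison you choose, keep equals() and hashCode() consistent with each other, because Hibernate relies on both when it compares a property value with its snapshot.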
5.3.4
Creating a UserType
Imagine that you’re working with a legacy database that represents all monetary amounts in USD. The application is no longer restricted to a single currency (that was the point of the refactoring), but it takes some time for the database team to make the changes. You need to convert the amount to USD when persisting MonetaryAmount objects. When you load from the database, you convert it back to the currency the user selected in his or her preferences. Create a new MonetaryAmountUserType class that implements the Hibernate interface UserType. This is your custom mapping type, shown in listing 5.4.
Listing 5.4 Custom mapping type for monetary amounts in USD
    public class MonetaryAmountUserType implements UserType {

        public int[] sqlTypes() {
            return new int[]{ Hibernate.BIG_DECIMAL.sqlType() };
        }

        public Class returnedClass() { return MonetaryAmount.class; }

        public boolean isMutable() { return false; }

        public Object deepCopy(Object value) { return value; }

        public Serializable disassemble(Object value) {
            return (Serializable) value;
        }

        public Object assemble(Serializable cached, Object owner) {
            return cached;
        }

        public Object replace(Object original,
                              Object target,
                              Object owner) {
            return original;
        }

        public boolean equals(Object x, Object y) {
            if (x == y) return true;
            if (x == null || y == null) return false;
            return x.equals(y);
        }

        public int hashCode(Object x) {
            return x.hashCode();
        }

        public Object nullSafeGet(ResultSet resultSet,
                                  String[] names,
                                  Object owner)
                throws SQLException {

            BigDecimal valueInUSD = resultSet.getBigDecimal(names[0]);
            // Deferred check after first read
            if (resultSet.wasNull()) return null;

            // Preference lookup and conversion are application code
            Currency userCurrency = User.getPreferences().getCurrency();
            MonetaryAmount amount =
                new MonetaryAmount(valueInUSD, Currency.getInstance("USD"));
            return amount.convertTo(userCurrency);
        }

        public void nullSafeSet(PreparedStatement statement,
                                Object value,
                                int index)
                throws HibernateException, SQLException {

            if (value == null) {
                statement.setNull(index, Hibernate.BIG_DECIMAL.sqlType());
            } else {
                MonetaryAmount anyCurrency = (MonetaryAmount) value;
                MonetaryAmount amountInUSD =
                    MonetaryAmount.convert(
                        anyCurrency,
                        Currency.getInstance("USD")
                    );
                statement.setBigDecimal(index, amountInUSD.getAmount());
            }
        }
    }
The sqlTypes() method tells Hibernate what SQL column types to use for DDL schema generation. Notice that this method returns an array of type codes. A UserType may map a single property to multiple columns, but this legacy data model has only a single numeric column. By using the Hibernate.BIG_DECIMAL.sqlType() method, you let Hibernate decide the exact SQL datatype for the given database dialect. Alternatively, return a constant from java.sql.Types.
The returnedClass() method tells Hibernate what Java value type class is mapped by this UserType.
Hibernate can make some minor performance optimizations for immutable types like this one, for example, when comparing snapshots during dirty checking. The isMutable() method tells Hibernate that this type is immutable.
The UserType is also partially responsible for creating a snapshot of a value in the first place. Because MonetaryAmount is an immutable class, the deepCopy() method returns its argument. In the case of a mutable type, it would need to return a copy of the argument to be used as the snapshot value.
The disassemble() method is called when Hibernate puts a MonetaryAmount into the second-level cache. As you’ll learn later, this is a cache of data that stores information in a serialized form.
The assemble() method does the opposite of disassembly: It can transform cached data into an instance of MonetaryAmount. As you can see, implementation of both routines is easy for immutable types.
Implement replace() to handle merging of detached object state. As you’ll see later in the book, the process of merging involves an original and a target object, whose state must be combined. Again, for immutable value types, return the first argument. For mutable types, at least return a deep copy of the first argument. For mutable types that have component fields, you probably want to apply a recursive merging routine.
The UserType is responsible for dirty checking property values. The equals() method compares the current property value to a previous snapshot and determines whether the property is dirty and must be saved to the database.
The hashCode() of two equal value typed instances has to be the same. We usually delegate this method to the actual value type class; in this case, the hashCode() method of the given MonetaryAmount object.
The nullSafeGet() method retrieves the property value from the JDBC ResultSet. You can also access the owner of the component if you need it for the conversion. All database values are in USD, so you convert them to the currency the user has currently set in his or her preferences. (Note that it’s up to you to implement this conversion and preference handling.)
The nullSafeSet() method writes the property value to the JDBC PreparedStatement. This method takes whatever currency is set and converts it to a simple BigDecimal USD amount before saving.
You now map the initialPrice property of Item as follows:
    <property name="initialPrice"
              column="INITIAL_PRICE"
              type="persistence.MonetaryAmountUserType"/>
Note that you place the custom user type into the persistence package; it’s part of the persistence layer of the application, not the domain model or business layer. To use a custom type in annotations, you have to add a Hibernate extension:
    @org.hibernate.annotations.Type(
        type = "persistence.MonetaryAmountUserType"
    )
    @Column(name = "INITIAL_PRICE")
    private MonetaryAmount initialPrice;
This is the simplest kind of transformation that a UserType can perform. Much more sophisticated things are possible. A custom mapping type can perform validation; it can read and write data to and from an LDAP directory; it can even retrieve persistent objects from a different database. You’re limited mainly by your imagination. In reality, we’d prefer to represent both the amount and currency of monetary amounts in the database, especially if the schema isn’t legacy but can be defined (or updated quickly). Let’s assume you now have two columns available and can store the MonetaryAmount without much conversion. A first option may again be a simple mapping. However, let’s try to solve it with a custom mapping type. (Instead of writing a new custom type, try to adapt the previous example for two columns. You can do this without changing the Java domain model classes— only the converter needs to be updated for this new requirement and the additional column named in the mapping.) The disadvantage of a simple UserType implementation is that Hibernate doesn’t know anything about the individual properties inside a MonetaryAmount. All it knows is the custom type class and the column names. The Hibernate query engine (discussed in more detail later) doesn’t know how to query for amount or a particular currency.
You write a CompositeUserType if you need the full power of Hibernate queries. This (slightly more complex) interface exposes the properties of the MonetaryAmount to Hibernate queries. We’ll now map it again with this more flexible customization interface to two columns, effectively producing an equivalent to a component mapping.
5.3.5
Creating a CompositeUserType
To demonstrate the flexibility of custom mapping types, you don’t change the MonetaryAmount class (and other persistent classes) at all; you change only the custom mapping type, as shown in listing 5.5.
Listing 5.5 Custom mapping type for monetary amounts in new database schemas
    public class MonetaryAmountCompositeUserType
            implements CompositeUserType {

        // public int[] sqlTypes()...
        public Class returnedClass...
        public boolean isMutable...
        public Object deepCopy...
        public Serializable disassemble...
        public Object assemble...
        public Object replace...
        public boolean equals...
        public int hashCode...

        public Object nullSafeGet(ResultSet resultSet,
                                  String[] names,
                                  SessionImplementor session,
                                  Object owner)
                throws SQLException {

            BigDecimal value = resultSet.getBigDecimal( names[0] );
            if (resultSet.wasNull()) return null;
            Currency currency =
                Currency.getInstance(resultSet.getString( names[1] ) );
            return new MonetaryAmount(value, currency);
        }

        public void nullSafeSet(PreparedStatement statement,
                                Object value,
                                int index,
                                SessionImplementor session)
                throws SQLException {

            if (value==null) {
                statement.setNull(index, Hibernate.BIG_DECIMAL.sqlType());
                statement.setNull(index+1, Hibernate.CURRENCY.sqlType());
            } else {
                MonetaryAmount amount = (MonetaryAmount) value;
                String currencyCode =
                    amount.getCurrency().getCurrencyCode();
                statement.setBigDecimal( index, amount.getAmount() );
                statement.setString( index+1, currencyCode );
            }
        }

        public String[] getPropertyNames() {
            return new String[] { "amount", "currency" };
        }

        public Type[] getPropertyTypes() {
            return new Type[] { Hibernate.BIG_DECIMAL,
                                Hibernate.CURRENCY };
        }

        public Object getPropertyValue(Object component, int property) {
            MonetaryAmount monetaryAmount = (MonetaryAmount) component;
            if (property == 0)
                return monetaryAmount.getAmount();
            else
                return monetaryAmount.getCurrency();
        }

        public void setPropertyValue(Object component,
                                     int property,
                                     Object value) {
            throw new
                UnsupportedOperationException("Immutable MonetaryAmount!");
        }
    }
The CompositeUserType interface requires the same housekeeping methods as the UserType you created earlier. However, the sqlTypes() method is no longer needed.
Loading a value in nullSafeGet() is now straightforward: You transform two column values in the result set to two property values in a new MonetaryAmount instance.
Saving a value in nullSafeSet() involves setting two parameters on the prepared statement.
A CompositeUserType exposes the properties of the value type through getPropertyNames().
The properties each have their own type, as defined by getPropertyTypes(). The types of the SQL columns are now implicit from this method.
The getPropertyValue() method returns the value of an individual property of the MonetaryAmount.
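The setPropertyValue() method sets the value of an individual property of the MonetaryAmount. Because MonetaryAmount is immutable, this implementation throws an UnsupportedOperationException instead.
The initialPrice property now maps to two columns, so you need to declare both in the mapping file. The first column stores the value; the second stores the currency of the MonetaryAmount:
    <property name="initialPrice"
              type="persistence.MonetaryAmountCompositeUserType">
        <column name="INITIAL_PRICE"/>
        <column name="INITIAL_PRICE_CURRENCY"/>
    </property>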
If Item is mapped with annotations, you have to declare several columns for this property. You can’t use the javax.persistence.Column annotation several times, so a new, Hibernate-specific annotation is needed:
    @org.hibernate.annotations.Type(
        type = "persistence.MonetaryAmountCompositeUserType"
    )
    @org.hibernate.annotations.Columns(columns = {
        @Column(name="INITIAL_PRICE"),
        @Column(name="INITIAL_PRICE_CURRENCY", length = 3)
    })
    private MonetaryAmount initialPrice;
In a Hibernate query, you can now refer to the amount and currency properties of the custom type, even though they don’t appear anywhere in the mapping document as individual properties:
from Item i where i.initialPrice.amount > 100.0 and i.initialPrice.currency = 'AUD'
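The contract behind such a query path can be pictured with plain Java. The sketch below is not Hibernate code: resolve() is a hypothetical helper, and the way Hibernate actually resolves i.initialPrice.amount internally is not shown here. It only illustrates that the names from getPropertyNames() and the index passed to getPropertyValue() in listing 5.5 must line up.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Currency;

class CompositePropertyLookup {

    // Same names, in the same order, as getPropertyNames() in listing 5.5.
    static final String[] PROPERTY_NAMES = { "amount", "currency" };

    // Mirrors getPropertyValue(component, property): index 0 is the amount,
    // any other index is the currency.
    static Object getPropertyValue(BigDecimal amount, Currency currency, int property) {
        return property == 0 ? amount : currency;
    }

    // Hypothetical helper: find the index of a property name, then fetch
    // the value at that index.
    static Object resolve(BigDecimal amount, Currency currency, String name) {
        int index = Arrays.asList(PROPERTY_NAMES).indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown property: " + name);
        }
        return getPropertyValue(amount, currency, index);
    }
}
```

If the order of the names and the index handling ever drift apart, a query on amount would silently compare against the currency, so it pays to keep the two methods next to each other in the type.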
You have extended the buffer between the Java object model and the SQL database schema with the new custom composite type. Both representations are now more robust to changes. Note that the number of columns isn’t relevant for your choice of UserType versus CompositeUserType—only your desire to expose value type properties for Hibernate queries. Parameterization is a helpful feature for all custom mapping types.
5.3.6
Parameterizing custom types
Let’s assume that you face the initial problem again: conversion of money to a different currency when storing it to the database. Often, problems are more subtle than a generic conversion; for example, you may store US dollars in some tables and Euros in others. You still want to write a single custom mapping type for this, which can do arbitrary conversions. This is possible if you add the ParameterizedType interface to your UserType or CompositeUserType classes:
    public class MonetaryAmountConversionType
            implements UserType, ParameterizedType {

        // Configuration parameter
        private Currency convertTo;

        public void setParameterValues(Properties parameters) {
            this.convertTo = Currency.getInstance(
                parameters.getProperty("convertTo")
            );
        }

        // ... Housekeeping methods

        public Object nullSafeGet(ResultSet resultSet,
                                  String[] names,
                                  SessionImplementor session,
                                  Object owner)
                throws SQLException {

            BigDecimal value = resultSet.getBigDecimal( names[0] );
            if (resultSet.wasNull()) return null;
            // When loading, take the currency from the database
            Currency currency = Currency.getInstance(
                resultSet.getString( names[1] )
            );
            return new MonetaryAmount(value, currency);
        }

        public void nullSafeSet(PreparedStatement statement,
                                Object value,
                                int index,
                                SessionImplementor session)
                throws SQLException {

            if (value==null) {
                // Both columns of the value are set to NULL
                statement.setNull(index, Types.NUMERIC);
                statement.setNull(index+1, Types.VARCHAR);
            } else {
                MonetaryAmount amount = (MonetaryAmount) value;
                // When storing, convert the amount to the
                // currency this converter was parameterized with
                MonetaryAmount dbAmount =
                    MonetaryAmount.convert(amount, convertTo);
                statement.setBigDecimal( index, dbAmount.getAmount() );
                statement.setString( index+1,
                    dbAmount.getCurrency().getCurrencyCode() );
            }
        }
    }
We left out the usual mandatory housekeeping methods in this example. The important additional method is setParameterValues() of the ParameterizedType interface. Hibernate calls this method on startup to initialize this class with a convertTo parameter. The nullSafeSet() method uses this setting to convert to the target currency when saving a MonetaryAmount. The nullSafeGet() method takes the currency that is present in the database and leaves it to the client to deal with the currency of a loaded MonetaryAmount (this asymmetric implementation isn’t the best idea, naturally). You now have to set the configuration parameters in your mapping file when you apply the custom mapping type. A simple solution is the nested <type> mapping on a property:
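    <property name="initialPrice">
        <column name="INITIAL_PRICE"/>
        <column name="INITIAL_PRICE_CURRENCY"/>
        <type name="persistence.MonetaryAmountConversionType">
            <param name="convertTo">USD</param>
        </type>
    </property>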
However, this is inconvenient and requires duplication if you have many monetary amounts in your domain model. A better strategy uses a separate definition of the type, including all parameters, under a unique name that you can then reuse across all your mappings. You do this with a separate <typedef>, an element (you can also use it without parameters):
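    <typedef class="persistence.MonetaryAmountConversionType"
             name="monetary_amount_usd">
        <param name="convertTo">USD</param>
    </typedef>

    <typedef class="persistence.MonetaryAmountConversionType"
             name="monetary_amount_eur">
        <param name="convertTo">EUR</param>
    </typedef>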
What we show here is a binding of a custom mapping type with some arguments to the names monetary_amount_usd and monetary_amount_eur. This definition can be placed anywhere in your mapping files; it’s a child element of <hibernate-mapping> (as mentioned earlier in the book, larger applications often have one or several MyCustomTypes.hbm.xml files with no class mappings). With Hibernate extensions, you can define named custom types with parameters in annotations:
    @org.hibernate.annotations.TypeDefs({
        @org.hibernate.annotations.TypeDef(
            name="monetary_amount_usd",
            typeClass = persistence.MonetaryAmountConversionType.class,
            parameters = { @Parameter(name="convertTo", value="USD") }
        ),
        @org.hibernate.annotations.TypeDef(
            name="monetary_amount_eur",
            typeClass = persistence.MonetaryAmountConversionType.class,
            parameters = { @Parameter(name="convertTo", value="EUR") }
        )
    })
This annotation metadata is global as well, so it can be placed outside any Java class declaration (right after the import statements) or in a separate file, package-info.java, as discussed in chapter 2, section 2.2.1, “Using Hibernate Annotations.” A good location in this system is in a package-info.java file in the persistence package. In XML mapping files and annotation mappings, you now refer to the defined type name instead of the fully qualified class name of your custom type:
    @org.hibernate.annotations.Type(type = "monetary_amount_eur")
    @org.hibernate.annotations.Columns({
        @Column(name = "BID_AMOUNT"),
        @Column(name = "BID_AMOUNT_CUR")
    })
    private MonetaryAmount bidAmount;
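To see the parameter handling of this section in isolation, here is a minimal sketch that runs on a plain JDK. ConversionSettings is a hypothetical stand-in for MonetaryAmountConversionType without the Hibernate interfaces, and the exchange rates are made-up constants; a real application would obtain rates from its own source, since the text leaves the conversion to you.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class ConversionSettings {

    // Hypothetical, hard-coded rates: units of each currency per one US dollar.
    private static final Map<String, BigDecimal> PER_USD =
        new HashMap<String, BigDecimal>();
    static {
        PER_USD.put("USD", new BigDecimal("1"));
        PER_USD.put("EUR", new BigDecimal("0.80"));
    }

    // Configuration parameter, as in MonetaryAmountConversionType.
    private Currency convertTo;

    // Same shape as ParameterizedType.setParameterValues(Properties).
    public void setParameterValues(Properties parameters) {
        this.convertTo =
            Currency.getInstance(parameters.getProperty("convertTo"));
    }

    public Currency getConvertTo() { return convertTo; }

    // Converts an amount from the given currency into the configured one,
    // going through US dollars and rounding to the target currency's digits.
    public BigDecimal convert(BigDecimal amount, Currency from) {
        BigDecimal inUsd = amount.divide(
            PER_USD.get(from.getCurrencyCode()), 10, RoundingMode.HALF_EVEN);
        return inUsd
            .multiply(PER_USD.get(convertTo.getCurrencyCode()))
            .setScale(convertTo.getDefaultFractionDigits(), RoundingMode.HALF_EVEN);
    }
}
```

Two instances configured with different convertTo values behave like the two named types monetary_amount_usd and monetary_amount_eur: the class is written once, and only the parameter differs.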
Let’s look at a different, extremely important, application of custom mapping types. The type-safe enumeration design pattern can be found in almost all applications.
5.3.7
Mapping enumerations
An enumeration type is a common Java idiom where a class has a constant (small) number of immutable instances. In CaveatEmptor, this can be applied to credit cards: for example, to express the possible types a user can enter and the application offers (Mastercard, Visa, and so on). Or, you can enumerate the possible ratings a user can submit in a Comment, about a particular auction.
In older JDKs, you had to implement such classes (let’s call them CreditCardType and Rating) yourself, following the type-safe enumeration pattern. This is still the right way to do it if you don’t have JDK 5.0; the pattern and compatible custom mapping types can be found on the Hibernate community website.
Using enumerations in JDK 5.0
If you use JDK 5.0, you can use the built-in language support for type-safe enumerations. For example, a Rating class looks as follows:
    package auction.model;

    public enum Rating { EXCELLENT, OK, BAD }
The Comment class has a property of this type:
    public class Comment {
        ...
        private Rating rating;
        private Item auction;
        ...
    }
This is how you use the enumeration in the application code:
Comment goodComment = new Comment(Rating.EXCELLENT, thisAuction);
You now have to persist this Comment instance and its Rating. One approach is to use the actual name of the enumeration and save it to a VARCHAR column in the COMMENTS table. This RATING column will then contain EXCELLENT, OK, or BAD, depending on the Rating given. Let’s write a Hibernate UserType that can load and store VARCHAR-backed enumerations, such as the Rating.
Writing a custom enumeration handler
Instead of the most basic UserType interface, we now want to show you the EnhancedUserType interface. This interface allows you to work with the Comment entity in XML representation mode, not only as a POJO (see the discussion of data representations in chapter 3, section 3.4, “Alternative entity representation”). Furthermore, the implementation you’ll write can support any VARCHAR-backed enumeration, not only Rating, thanks to the additional ParameterizedType interface. Look at the code in listing 5.6.
Listing 5.6 Custom mapping type for string-backed enumerations
    public class StringEnumUserType
            implements EnhancedUserType, ParameterizedType {

        private Class enumClass;

        public void setParameterValues(Properties parameters) {
            String enumClassName =
                parameters.getProperty("enumClassname");
            try {
                enumClass = ReflectHelper.classForName(enumClassName);
            } catch (ClassNotFoundException cnfe) {
                throw new
                    HibernateException("Enum class not found", cnfe);
            }
        }

        public Class returnedClass() {
            return enumClass;
        }

        public int[] sqlTypes() {
            return new int[] { Hibernate.STRING.sqlType() };
        }

        public boolean isMutable...
        public Object deepCopy...
        public Serializable disassemble...
        public Object replace...
        public Object assemble...
        public boolean equals...
        public int hashCode...

        public Object fromXMLString(String xmlValue) {
            return Enum.valueOf(enumClass, xmlValue);
        }

        public String objectToSQLString(Object value) {
            return '\'' + ( (Enum) value ).name() + '\'';
        }

        public String toXMLString(Object value) {
            return ( (Enum) value ).name();
        }

        public Object nullSafeGet(ResultSet rs,
                                  String[] names,
                                  Object owner)
                throws SQLException {
            String name = rs.getString( names[0] );
            return rs.wasNull() ? null : Enum.valueOf(enumClass, name);
        }

        public void nullSafeSet(PreparedStatement st,
                                Object value,
                                int index)
                throws SQLException {
            if (value == null) {
                st.setNull(index, Hibernate.STRING.sqlType());
            } else {
                st.setString( index, ( (Enum) value ).name() );
            }
        }
    }
The configuration parameter for this custom mapping type is the name of the enumeration class it’s used for, such as Rating.
It’s also the class that is returned from the returnedClass() method.
A single VARCHAR column is needed in the database table. You keep it portable by letting Hibernate decide the SQL datatype in sqlTypes().
The methods from isMutable() to hashCode() are the usual housekeeping methods for an immutable type.
The three methods fromXMLString(), objectToSQLString(), and toXMLString() are part of the EnhancedUserType and are used for XML marshalling.
When you’re loading an enumeration in nullSafeGet(), you get its name from the database and create an instance.
When you’re saving an enumeration in nullSafeSet(), you store its name.
Next, you’ll map the rating property with this new custom type.
Mapping enumerations with XML and annotations
In the XML mapping, first create a custom type definition:
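    <typedef name="rating" class="persistence.StringEnumUserType">
        <param name="enumClassname">auction.model.Rating</param>
    </typedef>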
You can now use the type named rating in the Comment class mapping:
    <property name="rating"
              column="RATING"
              type="rating"
              not-null="true"
              update="false"
              access="field"/>
Because ratings are immutable, you map it as update="false" and enable direct field access (no setter method for immutable properties). If other classes besides Comment have a Rating property, use the defined custom mapping type again. The definition and declaration of this custom mapping type in annotations looks the same as the one you did in the previous section.
On the other hand, you can rely on the Java Persistence provider to persist enumerations. If you have a property in one of your annotated entity classes of type java.lang.Enum (such as the rating in your Comment), and it isn’t marked as @Transient or transient (the Java keyword), the Hibernate JPA implementation must persist this property out of the box without complaining; it has a built-in type that handles this. This built-in mapping type has to default to a representation of an enumeration in the database. The two common choices are string representation, as you implemented for native Hibernate with a custom type, or ordinal representation. An ordinal representation saves the position of the selected enumeration option, which is zero-based in Java: for example, 0 for EXCELLENT, 1 for OK, and 2 for BAD. The database column also defaults to a numeric column. You can change this default enumeration mapping with the Enumerated annotation on your property:
    public class Comment {
        ...
        @Enumerated(EnumType.STRING)
        @Column(name = "RATING", nullable = false, updatable = false)
        private Rating rating;
        ...
    }
You’ve now switched to a string-based representation, effectively the same representation your custom type can read and write. You can also use a JPA XML descriptor:
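    <entity class="auction.model.Comment" access="FIELD">
        <attributes>
            ...
            <basic name="rating">
                <column name="RATING" nullable="false" updatable="false"/>
                <enumerated>STRING</enumerated>
            </basic>
        </attributes>
    </entity>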
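The difference between the string and the ordinal representation comes down to which enum method produces the stored value. The following sketch uses only the JDK; the helper class RatingRepresentations and its method names are hypothetical and exist only to make the two round trips visible.

```java
enum Rating { EXCELLENT, OK, BAD }

class RatingRepresentations {

    // String representation: the value a VARCHAR column would hold.
    static String toColumnString(Rating rating) {
        return rating.name();
    }

    static Rating fromColumnString(String value) {
        return Enum.valueOf(Rating.class, value);
    }

    // Ordinal representation: the value a numeric column would hold.
    // Java ordinals are zero-based and follow declaration order.
    static int toColumnOrdinal(Rating rating) {
        return rating.ordinal();
    }

    static Rating fromColumnOrdinal(int ordinal) {
        return Rating.values()[ordinal];
    }
}
```

Because ordinals follow declaration order, inserting or reordering constants in Rating changes the meaning of rows that were stored as numbers, whereas stored names only break if a constant is renamed.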
You may (rightfully) ask why you have to write your own custom mapping type for enumerations when obviously Hibernate, as a Java Persistence provider, can persist and load enumerations out of the box. The secret is that Hibernate Annotations includes several custom mapping types that implement the behavior defined by Java Persistence. You could use these custom types in XML mappings; however, they aren’t user friendly (they need many parameters) and weren’t written for that purpose. You can check the source (such as org.hibernate.type.EnumType in Hibernate Annotations) to learn their parameters and decide if you want to use them directly in XML.
Querying with custom mapping types
One further problem you may run into is using enumerated types in Hibernate queries. For example, consider the following query in HQL that retrieves all comments that are rated “bad”:
Query q = session.createQuery( "from Comment c where c.rating = auction.model.Rating.BAD" );
Although this query works if you persist your enumeration as a string (the query parser uses the enumeration value as a constant), it doesn’t work if you selected ordinal representation. You have to use a bind parameter and set the rating value for the comparison programmatically:
    Query q =
        session.createQuery("from Comment c where c.rating = :rating");
    Properties params = new Properties();
    params.put("enumClassname", "auction.model.Rating");
    q.setParameter("rating", Rating.BAD,
        Hibernate.custom(StringEnumUserType.class, params)
    );
The last line in this example uses the static helper method Hibernate.custom() to convert the custom mapping type to a Hibernate Type; this is a simple way to tell Hibernate about your enumeration mapping and how to deal with the Rating.BAD value. Note that you also have to tell Hibernate about any initialization properties the parameterized type may need. Unfortunately, there is no API in Java Persistence for arbitrary and custom query parameters, so you have to fall back to the Hibernate Session API and create a Hibernate Query object. We recommend that you become intimately familiar with the Hibernate type system and that you consider the creation of custom mapping types an essential skill—it will be useful in every application you develop with Hibernate or JPA.
5.4
Summary
In this chapter, you learned how inheritance hierarchies of entities can be mapped to the database with the four basic inheritance mapping strategies: table per concrete class with implicit polymorphism, table per concrete class with unions, table per class hierarchy, and the normalized table per subclass strategy. You’ve seen how these strategies can be mixed for a particular hierarchy and when each strategy is most appropriate. We also elaborated on the Hibernate entity and value type distinction, and how the Hibernate mapping type system works. You used various built-in types and wrote your own custom types by utilizing the Hibernate extension points such as UserType and ParameterizedType. Table 5.5 shows a summary you can use to compare native Hibernate features and Java Persistence.
Table 5.5 Hibernate and JPA comparison chart for chapter 5
Hibernate Core | Java Persistence and EJB 3.0
Supports four inheritance mapping strategies. Mixing of inheritance strategies is possible. | Four inheritance mapping strategies are standardized; mixing strategies in one hierarchy isn’t considered portable. Only table per class hierarchy and table per subclass are required for JPA-compliant providers.
A persistent supertype can be an abstract class or an interface (with property accessor methods only). | A persistent supertype can be an abstract class; mapped interfaces aren’t considered portable.
Provides flexible built-in mapping types and converters for value typed properties. | There is automatic detection of mapping types, with standardized override for temporal and enum mapping types. Hibernate extension annotation is used for any custom mapping type declaration.
Powerful extendable type system. | The standard requires built-in types for enumerations, LOBs, and many other value types for which you’d have to write or apply a custom mapping type in native Hibernate.
The next chapter introduces collection mappings and discusses how you can handle collections of value typed objects (for example, a collection of Strings) and collections that contain references to entity instances.
Mapping collections and entity associations
This chapter covers
■ Basic collection mapping strategies
■ Mapping collections of value types
■ Mapping a parent/children entity relationship
Two important (and sometimes difficult to understand) topics didn’t appear in the previous chapters: the mapping of collections, and the mapping of associations between entity classes. Most developers new to Hibernate are dealing with collections and entity associations for the first time when they try to map a typical parent/child relationship. But instead of jumping right into the middle, we start this chapter with basic collection mapping concepts and simple examples. After that, you’ll be prepared for the first collection in an entity association—although we’ll come back to more complicated entity association mappings in the next chapter. To get the full picture, we recommend you read both chapters.
6.1
Sets, bags, lists, and maps of value types
An object of value type has no database identity; it belongs to an entity instance, and its persistent state is embedded in the table row of the owning entity, at least if an entity has a reference to a single instance of a value type. If an entity class has a collection of value types (or a collection of references to value-typed instances), you need an additional table, the so-called collection table.
Before you map collections of value types to collection tables, remember that value-typed classes don’t have identifiers or identifier properties. The lifespan of a value-type instance is bounded by the lifespan of the owning entity instance. A value type doesn’t support shared references.
Java has a rich collection API, so you can choose the collection interface and implementation that best fits your domain model design. Let’s walk through the most common collection mappings.
Suppose that sellers in CaveatEmptor are able to attach images to Items. An image is accessible only via the containing item; it doesn’t need to support associations from any other entity in your system. The application manages the collection of images through the Item class, adding and removing elements. An image object has no life outside of the collection; it’s dependent on an Item entity. In this case, it isn’t unreasonable to model the image class as a value type. Next, you need to decide what collection to use.
6.1.1
Selecting a collection interface
The idiom for a collection property in the Java domain model is always the same:
    private <<Interface>> images = new <<Implementation>>();
    ...
    // Getter and setter methods
Use an interface to declare the type of the property, not an implementation. Pick a matching implementation, and initialize the collection right away; doing so avoids uninitialized collections (we don’t recommend initializing collections late, in constructors or setter methods). If you work with JDK 5.0, you’ll likely code with the generic versions of the JDK collections. Note that this isn’t a requirement; you can also specify the contents of the collection explicitly in mapping metadata. Here’s a typical generic Set with a type parameter:
    private Set<String> images = new HashSet<String>();
    ...
    // Getter and setter methods
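Written out in full, the idiom might look like the following sketch. The Item class here is reduced to the images collection only, and the filenames in the usage note are made up for illustration; nothing in it depends on Hibernate.

```java
import java.util.HashSet;
import java.util.Set;

class Item {

    // Interface type on the left, matching implementation on the right,
    // initialized right away at declaration.
    private Set<String> images = new HashSet<String>();

    public Set<String> getImages() {
        return images;
    }

    public void setImages(Set<String> images) {
        this.images = images;
    }
}
```

Calling item.getImages().add("foo.jpg") twice leaves a single element in the collection, because a Set rejects duplicates; that is the semantic a persistent set has to preserve.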
Out of the box, Hibernate supports the most important JDK collection interfaces. In other words, it knows how to preserve the semantics of JDK collections, maps, and arrays in a persistent fashion. Each interface has a matching implementation supported by Hibernate, and it’s important that you use the right combination. Hibernate only wraps the collection object you’ve already initialized on declaration of the field (or sometimes replaces it, if it’s not the right one). Without extending Hibernate, you can choose from the following collections:
■ A java.util.Set is mapped with a <set> element. Initialize the collection with a java.util.HashSet. The order of its elements isn’t preserved, and duplicate elements aren’t allowed. This is the most common persistent collection in a typical Hibernate application.
■ A java.util.SortedSet can be mapped with <set>, and the sort attribute can be set to either a comparator or natural ordering for in-memory sorting. Initialize the collection with a java.util.TreeSet instance.
■ A java.util.List can be mapped with <list>, preserving the position of each element with an additional index column in the collection table. Initialize with a java.util.ArrayList.
■ A java.util.Collection can be mapped with <bag> or <idbag>. Java doesn’t have a Bag interface or an implementation; however, java.util.Collection allows bag semantics (possible duplicates, no element order is preserved). Hibernate supports persistent bags (it uses lists internally but ignores the index of the elements). Use a java.util.ArrayList to initialize a bag collection.
■ A java.util.Map can be mapped with